Is Spearman's rank correlation the right metric for benchmarking?

soviet-manfred · December 9, 2024, 9:49am

The current leaderboard (09.12.24) shows that ranking the submissions by MSE or by Spearman’s correlation would give practically the same order, with one outlier: the submission “many-kalin / deepspot” has the worst MSE (0.513) of all submissions but is ranked 3rd. Perhaps Pearson correlation would be a better choice.
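To illustrate how a submission can have a poor MSE and still rank well on Spearman's correlation, here is a minimal sketch in plain Python. The data is a made-up toy example, not leaderboard values, and the names `noisy` and `scaled` are hypothetical. Note that Pearson correlation is also insensitive to scale and offset, so it would not penalize a mis-scaled prediction either.

```python
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data (assumption, for illustration only).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
noisy = [1.2, 2.1, 2.9, 4.2, 4.1]      # close in value, last two swapped in order
scaled = [3.0, 6.0, 9.0, 12.0, 15.0]   # wrong scale, but perfectly ordered

for name, pred in [("noisy", noisy), ("scaled", scaled)]:
    print(
        f"{name:>6}: MSE={mse(y_true, pred):.3f}  "
        f"Spearman={spearman(y_true, pred):.3f}  "
        f"Pearson={pearson(y_true, pred):.3f}"
    )
```

In this toy case `scaled` has by far the larger MSE yet the higher Spearman (and Pearson) correlation, which is the same kind of disagreement as the outlier described above.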

enzo · December 9, 2024, 10:35am

The reason was shared in the announcement.

The EWSC at the Broad Institute has decided to experiment with two scoring metrics:

→ MSE (Mean Squared Error)
→ Spearman Correlation

This approach is being tested to determine which metric will provide the best evaluation for participants. Broad will finalize which score will be used for the next checkpoint and the final scoring of broad-1.

soviet-manfred · December 9, 2024, 11:32am

Thank you for the immediate answer.
