Throughout Crunch 1 and Crunch 2, there have been various discussions and changes regarding the evaluation metrics (MSE vs Pearson vs Spearman). However, it remains unclear which metrics will be used. There are also concerns that the current metrics may not align with state-of-the-art metrics referenced in recent literature (cell-wise vs gene-wise correlation).
It is challenging to optimize models effectively without a clear understanding of the challenge objectives. Could you kindly provide more details on the evaluation metrics soon?
@enzo @cruncher-abde Would it be possible to add the gene-wise Spearman correlation to the leaderboard as a side column, just so we can look at it while waiting for a decision? It only requires a small tweak in the code, and the metrics are not finalized yet anyway. It might even help inform the final decision.
Hi, here is a gene-wise implementation similar to your cell-wise implementation. I am also including a second, shorter implementation with fewer transformations that only uses pandas.
import pandas
import numpy
import scipy.stats


def _spearman_cell_wise(
    prediction: pandas.DataFrame,
    y_test: pandas.DataFrame,
):
    # Every cell (row) contributes equally to the final score.
    cell_count = len(y_test.index)
    weight_on_cells = numpy.ones(cell_count) / cell_count

    A = y_test.to_numpy()
    B = prediction.to_numpy()

    # Rank the genes within each cell (row-wise).
    rank_A = scipy.stats.rankdata(A, axis=1)
    rank_B = scipy.stats.rankdata(B, axis=1)

    # Spearman = Pearson correlation of the ranks, computed per cell.
    # (Using the global rank mean is valid here: every row's mean rank
    # is the same constant, (gene_count + 1) / 2.)
    corrs_cell = (
        numpy.multiply(rank_A - numpy.mean(rank_A), rank_B - numpy.mean(rank_B)).mean(axis=1)
        / (numpy.std(rank_A, axis=1) * numpy.std(rank_B, axis=1))
    )

    # A cell with no variation has zero rank variance; map NaN to 0.
    corrs_cell[numpy.isnan(corrs_cell)] = 0

    return numpy.sum(weight_on_cells * corrs_cell)
def _spearman_gene_wise(
    prediction: pandas.DataFrame,
    y_test: pandas.DataFrame,
):
    # Ensure that y_test and prediction have the same number of columns.
    assert prediction.shape[1] == y_test.shape[1], "Prediction and y_test must have the same number of features (columns)"

    # Every gene (column) contributes equally to the final score.
    feature_count = prediction.shape[1]
    weight_on_features = numpy.ones(feature_count) / feature_count

    # Convert DataFrames to numpy arrays for easier manipulation.
    A = y_test.to_numpy()
    B = prediction.to_numpy()

    # Rank the cells within each gene (column-wise).
    rank_A = scipy.stats.rankdata(A, axis=0)
    rank_B = scipy.stats.rankdata(B, axis=0)

    # Spearman = Pearson correlation of the column-wise ranks, per gene.
    corrs_feature = (
        numpy.multiply(rank_A - numpy.mean(rank_A, axis=0), rank_B - numpy.mean(rank_B, axis=0)).mean(axis=0)
        / (numpy.std(rank_A, axis=0) * numpy.std(rank_B, axis=0))
    )

    # Handle NaNs, which occur when a gene has no variation.
    corrs_feature[numpy.isnan(corrs_feature)] = 0

    # Return the weighted sum of gene-wise correlations.
    return numpy.sum(weight_on_features * corrs_feature)
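As a quick sanity check (on random toy data, purely illustrative), the vectorized gene-wise score matches averaging scipy.stats.spearmanr over the columns:

rng = numpy.random.default_rng(0)
y_test_demo = pandas.DataFrame(rng.random((50, 10)))
prediction_demo = pandas.DataFrame(rng.random((50, 10)))

vectorized = _spearman_gene_wise(prediction_demo, y_test_demo)
reference = numpy.mean([
    scipy.stats.spearmanr(prediction_demo[col], y_test_demo[col]).correlation
    for col in y_test_demo.columns
])
assert numpy.isclose(vectorized, reference)

And here is the second, pandas-only implementation: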
def _spearman_cell_wise(
    prediction: pandas.DataFrame,
    y_test: pandas.DataFrame,
):
    # axis=1: correlate each row (cell) of y_test with the matching row of prediction.
    score = y_test.corrwith(prediction, method="spearman", axis=1).fillna(0).mean()
    return score


def _spearman_gene_wise(
    prediction: pandas.DataFrame,
    y_test: pandas.DataFrame,
):
    # axis=0: correlate each column (gene) of y_test with the matching column of prediction.
    score = y_test.corrwith(prediction, method="spearman", axis=0).fillna(0).mean()
    return score
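To illustrate why the cell-wise vs gene-wise distinction raised above matters, here is a toy example (data and names invented for illustration): a prediction that perfectly preserves the ordering of genes within every cell scores 1.0 cell-wise, but generally below 1.0 gene-wise.

rng = numpy.random.default_rng(0)
y_test_demo = pandas.DataFrame(rng.random((100, 20)))

# Replace every cell's expression vector with its within-cell ranks:
# the ordering inside each row is preserved exactly, but the ordering
# across rows (i.e. per gene) is not.
prediction_demo = pandas.DataFrame(
    scipy.stats.rankdata(y_test_demo.to_numpy(), axis=1),
    index=y_test_demo.index,
    columns=y_test_demo.columns,
)

print(_spearman_cell_wise(prediction_demo, y_test_demo))  # 1.0 by construction
print(_spearman_gene_wise(prediction_demo, y_test_demo))  # below 1.0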
I encourage you to use the second, simpler implementation, which only uses pandas. You could also easily switch to the Pearson correlation if needed.
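For example, switching the gene-wise variant to Pearson is a one-argument change (the function name _pearson_gene_wise below is just something I made up for illustration):

def _pearson_gene_wise(
    prediction: pandas.DataFrame,
    y_test: pandas.DataFrame,
):
    # Identical to the Spearman version, except method="pearson".
    score = y_test.corrwith(prediction, method="pearson", axis=0).fillna(0).mean()
    return score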
I originally just implemented what the Broad team gave me. Making a mistake here would be horrible; just copy-pasting their code makes my responsibility lighter.