Your response does not sufficiently address our questions or provide meaningful insights. As we previously noted, this is simply a toy example, in which you can assume that the data are already normalized, scaled, and filtered to include the most variable features. Its sole purpose is to illustrate that a high Spearman cell-wise correlation does not necessarily translate to a high Spearman gene-wise correlation.
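To make the point concrete, here is a minimal, self-contained sketch. It is not the toy example referred to above; the data-generating assumptions are ours and purely illustrative: each gene has its own baseline level, and the prediction reproduces those baselines but carries no information about cell-to-cell variation. The names `Y`, `Y_hat`, `cell_wise`, and `gene_wise` are hypothetical.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based average ranks (tied values share the mean rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


random.seed(0)
n_cells, n_genes = 200, 50

# Assumption: gene g has baseline expression g; cells differ only by noise.
baseline = [float(g) for g in range(n_genes)]
Y = [[b + random.gauss(0, 1) for b in baseline] for _ in range(n_cells)]
# The prediction knows the gene baselines but nothing about individual cells.
Y_hat = [[b + random.gauss(0, 1) for b in baseline] for _ in range(n_cells)]

# Cell-wise: correlate across genes within each cell, then average over cells.
cell_wise = mean(spearman(Y[c], Y_hat[c]) for c in range(n_cells))

# Gene-wise: correlate across cells for each gene, then average over genes.
gene_wise = mean(
    spearman([Y[c][g] for c in range(n_cells)],
             [Y_hat[c][g] for c in range(n_cells)])
    for g in range(n_genes)
)

print(f"mean cell-wise Spearman: {cell_wise:.3f}")
print(f"mean gene-wise Spearman: {gene_wise:.3f}")
```

Under these assumptions the cell-wise score is close to 1 because it only rewards recovering the ordering of gene baselines within a cell, while the gene-wise score is close to 0 because the prediction says nothing about which cells express a given gene more.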
Additionally, you are again normalizing the prediction (Y_hat), a practice we have already discussed as problematic; see "Re-apply normalization", post #5 by many-kalin.
Given that this is intended to be a scientific challenge, it remains unclear why performance is evaluated cell-wise rather than gene-wise. Gene-wise evaluation is the common practice in the literature [1,2,3,4,5,6], and it matches the end goal of distinguishing between healthy and disease populations (Crunch 3). Moreover, Crunch 3 focuses on ranking genes by their predictive value within the population (many cells) across both conditions, rather than on identifying “which cell is the single most diseased one”.
[1] Jaume, G., Doucet, P., Song, A.H., Lu, M.Y., Almagro-Pérez, C., Wagner, S.J., Vaidya, A.J., Chen, R.J., Williamson, D.F., Kim, A. and Mahmood, F., 2024. Hest-1k: A dataset for spatial transcriptomics and histology image analysis. arXiv preprint arXiv:2406.16192.
[2] Xie, R., Pang, K., Chung, S., Perciani, C., MacParland, S., Wang, B. and Bader, G., 2024. Spatially Resolved Gene Expression Prediction from Histology Images via Bi-modal Contrastive Learning. Advances in Neural Information Processing Systems, 36.
[3] He, B., Bergenstråhle, L., Stenbeck, L., Abid, A., Andersson, A., Borg, Å., Maaskola, J., Lundeberg, J. and Zou, J., 2020. Integrating spatial gene expression and breast tumour morphology via deep learning. Nature biomedical engineering, 4(8), pp.827-834.
[4] Jia, Y., Liu, J., Chen, L., Zhao, T. and Wang, Y., 2024. THItoGene: a deep learning method for predicting spatial transcriptomics from histological images. Briefings in Bioinformatics, 25(1), p.bbad464.
[5] Pang, M., Su, K. and Li, M., 2021. Leveraging information in spatial transcriptomics to predict super-resolution gene expression from histology images in tumors. bioRxiv, pp.2021-11.
[6] Schmauch, B., Herpin, L., Olivier, A., Duboudin, T., Dubois, R., Gillet, L., Schiratti, J.B., Di Proietto, V., Le Corre, D., Bourgoin, A. and Taïeb, J., 2024. A deep learning-based multiscale integration of spatial omics with tumor morphology. bioRxiv, pp.2024-07.
and many more