Hi, the two genes above are missing from the shared reference atlas, so the overlap is only 458 genes instead of 460. Could you confirm this?
Please find Broad’s response below:
Hi Abdennour,
IL9 and PRG2 are not in the list of 2000 genes that need to be predicted for Crunch2.
IL9 is not in the Crunch2_scRNAseq.h5ad file. PRG2 has a synonym called MBP, which is in the dataset.
I think participants are free to evaluate the quality of the provided dataset and include additional data if needed.
There is no MBP in Crunch2_scRNAseq.h5ad.
Gene number 9407 in Crunch2_scRNAseq.h5ad is MBP. You can find it with the following code:
import os
import scanpy
scRNAseq = scanpy.read_h5ad(os.path.join(data_directory_path, 'Crunch2_scRNAseq.h5ad'))
print(scRNAseq.var.index.tolist().index('MBP'))
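If you want to check several genes at once, including their synonyms, here is a minimal sketch. It works on a plain list of gene names, so in practice you would pass something like scRNAseq.var.index.tolist(). The function name locate_genes, the toy gene list and the synonym map are assumptions made for illustration only; the only synonym taken from this thread is PRG2 → MBP.

```python
def locate_genes(var_names, genes, synonyms=None):
    """Map each requested gene to (matched_name, position), or None if absent."""
    synonyms = synonyms or {}
    # Remember the first position of every name, like list.index() would.
    position = {}
    for i, name in enumerate(var_names):
        position.setdefault(name, i)

    result = {}
    for gene in genes:
        # Try the gene symbol itself first, then any known synonyms.
        for candidate in [gene, *synonyms.get(gene, [])]:
            if candidate in position:
                result[gene] = (candidate, position[candidate])
                break
        else:
            result[gene] = None
    return result


# Toy stand-in for scRNAseq.var.index.tolist(); NOT the real gene order.
toy_var_names = ["ACTB", "MBP", "GAPDH"]
hits = locate_genes(toy_var_names, ["IL9", "PRG2"], synonyms={"PRG2": ["MBP"]})
print(hits)  # {'IL9': None, 'PRG2': ('MBP', 1)}
```

A hit through a synonym is worth double-checking by hand, since MBP is also the official symbol of myelin basic protein, so a match on the name alone does not prove that the column is PRG2.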