Metric evaluation for test data

There are two ways to compute the metric over the test data:

  1. Compute the Spearman coefficient for each date, then average it over all dates.

  2. Combine the predictions and targets for the complete test period, then compute a single Spearman coefficient over all the data.

Which one is used for the test data?

Thanks!

The objective of the problem is to rank within each cross-section (that is, within each date) out of sample. Hence the out-of-sample fitness function is the first option: the Spearman correlation is computed separately for each date and then averaged over all dates.
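
To make the difference between the two options concrete, here is a minimal sketch in plain Python. It is not the organizers' scoring code: the row layout (date, prediction, target), the function names, and the toy data are all assumptions made for illustration, and Spearman is computed as the Pearson correlation of average ranks.

```python
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def per_date_spearman(rows):
    """Option 1: Spearman within each date, then the mean over dates."""
    by_date = defaultdict(lambda: ([], []))
    for date, pred, target in rows:
        by_date[date][0].append(pred)
        by_date[date][1].append(target)
    scores = [spearman(preds, targets) for preds, targets in by_date.values()]
    return sum(scores) / len(scores)


def pooled_spearman(rows):
    """Option 2: one Spearman over all rows, ignoring the date."""
    return spearman([r[1] for r in rows], [r[2] for r in rows])


# Hypothetical toy data: (date, prediction, target).
# Within each date the predictions rank the targets perfectly,
# but the target level shifts between the two dates.
rows = [
    ("d1", 1.0, 10.0), ("d1", 2.0, 20.0), ("d1", 3.0, 30.0),
    ("d2", 4.0, 1.0), ("d2", 5.0, 2.0), ("d2", 6.0, 3.0),
]

print(per_date_spearman(rows))  # 1.0: perfect ranking inside every date
print(pooled_spearman(rows))    # about -0.54: the level shift dominates
```

On this toy data the two options disagree sharply, because the pooled version also scores how well predictions track level differences between dates, which is not part of a per-cross-section ranking objective. The sketch does not handle a date on which all predictions or all targets are equal; such a date would need to be skipped or scored by some convention.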