Save X_test at inference time for use in train

Can we save the X_test passed to the infer method under the resource directory and use it in the train method?
This would take advantage of the X from the embargo period, which is not normally available during training, but it also seems somewhat sneaky.
Still, it differs from leakage because it does not use the target variable y from the embargo period.

To be sure I understand: you train on dates 0 to n and infer on date n+2, since n+1 is the embargo date.

For the next iteration, the current system allows you to train from date 0 to n+1, n+2 being the embargo date, and infer on n+3.
You would like to make use of X_{n+2} (not y_{n+2}, as you say) in order to perform the inference at date n+3.

Is this correct? If so, I think it’s an interesting line of thought. Do you need something from us to experiment with this? Ccing @enzo, who tells me it’s technically possible.

Yes, your comment is exactly what I want to do.
I wanted to make sure that it does not violate the terms and conditions.

Could you also call the train method with X_embargo added as an argument on the server side?
It would be nice to have this form of train method available as an option.

No look-ahead, no problem with the terms and conditions.

As for developing a new feature this late in the competition, we would prefer not to introduce anything that, even without changing the rules, could give one participant an advantage over another.

If you can dump the X_{n+2} when you do the inference on it and load it the iteration after, I feel that would be fair for the rest of the players. If you cannot, we can discuss this with @enzo.

Please let me know

I thought the method you described was the most fair. I felt that the method I requested was indeed unfair. I will try the dump method within the infer method.
Thank you for the consultation!
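
For anyone who wants to try the dump approach discussed above, here is a minimal sketch. It assumes the resource directory is writable and persists between the inference call and the next training call. The helper names, the `x_test_cache` subfolder, and the use of pickle are illustrative assumptions, not part of the competition's API. Only X is stored, never y.

```python
import pickle
from pathlib import Path

CACHE_SUBDIR = "x_test_cache"  # hypothetical subfolder inside the resource directory


def dump_x_test(x_test, date, resources_dir):
    """Call from infer(): save the features seen at inference, keyed by date."""
    cache_dir = Path(resources_dir) / CACHE_SUBDIR
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"x_test_{date}.pkl"
    with path.open("wb") as f:
        pickle.dump(x_test, f)
    return path


def load_cached_x_tests(resources_dir):
    """Call from train(): return {date: x_test} for every dump found so far."""
    cache_dir = Path(resources_dir) / CACHE_SUBDIR
    if not cache_dir.is_dir():
        return {}  # first iteration: nothing has been dumped yet
    cached = {}
    for path in sorted(cache_dir.glob("x_test_*.pkl")):
        date = int(path.stem.split("_")[-1])
        with path.open("rb") as f:
            cached[date] = pickle.load(f)
    return cached
```

Inside train, the loaded dictionary can then be filtered to the dates that are not already part of the training data, for example the date inferred on in the previous iteration.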


Just checking to make sure.

When phase two of the competition starts, at the start, models can be trained on the data from date 0 to 299. Is that correct?

Yes, that is what will happen in the first out-of-sample week.
Your model will be trained on dates 0 to 298 and will have to infer on moon 300.

Then it continues normally:

  • 0-299, infer on 301
  • 0-300, infer on 302
  • 0-301, infer on 303
  • 0-333, infer on 335
  • end
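
The schedule above follows a simple rule: with one embargo date, training on dates 0 to n is followed by inference on date n+2. A small sketch that reproduces it is shown below. The function name and its parameters are illustrative assumptions, not part of the competition tooling.

```python
def walk_forward_schedule(first_train_end, last_train_end, embargo=1):
    """List (train_start, train_end, infer_date) for each iteration.

    The `embargo` dates right after train_end are skipped, so the
    inference date is train_end + embargo + 1.
    """
    return [
        (0, train_end, train_end + embargo + 1)
        for train_end in range(first_train_end, last_train_end + 1)
    ]


schedule = walk_forward_schedule(298, 333)
print(schedule[0])   # (0, 298, 300)
print(schedule[1])   # (0, 299, 301)
print(schedule[-1])  # (0, 333, 335)
```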