Save X_test at inference time for use in train

multiple-masahiro · August 10, 2023, 5:38pm

Can we save the X_test passed to the infer method under the resource directory and use it in the train method?
This would take advantage of X from the embargo period, which is not normally available during training, but it also seems somewhat sneaky.
Still, it differs from leakage because it does not use the embargo-period target variable y.

resonant-poincare · August 10, 2023, 6:49pm

Just to be sure I understand: you train on dates 0 to n and infer on date n+2, since n+1 is under embargo.

For the next iteration, the current system allows you to train from date 0 to n+1, n+2 being the embargo date, and infer on n+3.
You would like to make use of X_{n+2} (not y_{n+2}, as you say) in order to perform the inference at date n+3.

Is this correct? If so, I think it’s an interesting line of thought. Do you need something from us to experiment with this? Ccing @enzo, who tells me it’s technically possible.

multiple-masahiro · August 11, 2023, 1:39am

Yes, your comment is exactly what I want to do.
I wanted to make sure that it is not a problem with respect to the terms and conditions.

Could you also call the train method with X_embargo added as an argument on the server side?
It would be nice to have this form of train method available as an option.

resonant-poincare · August 11, 2023, 8:44am

No look-ahead, no problem with the terms and conditions.

Since this would mean developing a new feature this late in the competition, we would prefer not to introduce anything that, even without changing the rules, could give one participant an advantage over another.

If you can dump the X_{n+2} when you do the inference on it and load it the iteration after, I feel that would be fair for the rest of the players. If you cannot, we can discuss this with @enzo.

Please let me know
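The dump-and-reload approach suggested above could be sketched roughly as follows. This is only a minimal illustration using the standard library: the file name `X_embargo.pkl`, the helper names, and the `resource_dir` argument are assumptions made for the example, not part of the competition's actual interface, so adapt them to whatever directory argument your own train and infer methods receive.

```python
import pickle
from pathlib import Path

# Hypothetical file name; the thread does not specify one.
EMBARGO_FILE = "X_embargo.pkl"


def save_embargo_features(X_test, resource_dir):
    """Call from infer(): persist the features seen at inference time."""
    path = Path(resource_dir) / EMBARGO_FILE
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(X_test, f)
    return path


def load_embargo_features(resource_dir):
    """Call from train(): return the saved features, or None on the first run."""
    path = Path(resource_dir) / EMBARGO_FILE
    if not path.exists():
        # Nothing was dumped yet (e.g. the very first training iteration).
        return None
    with path.open("rb") as f:
        return pickle.load(f)
```

In this sketch, infer would call `save_embargo_features(X_test, resource_dir)` before predicting, and the next train call would use `load_embargo_features(resource_dir)` to retrieve the features. Since no y is available for those dates, the loaded X is only usable for label-free steps such as feature scaling or other unsupervised preprocessing.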

multiple-masahiro · August 11, 2023, 1:50pm

I think the method you described is the fairest. I agree that the method I requested was indeed unfair. I will try dumping X inside the infer method.
Thank you for the consultation!

bigfish · August 12, 2023, 6:27pm

@resonant-poincare

Just checking to make sure.

When phase two of the competition starts, models can initially be trained on the data from date 0 to 299. Is that correct?

enzo · August 14, 2023, 8:12am

Yes, that's what will happen in the first out-of-sample week.
Your model will be trained on dates 0 to 298 and will have to infer on moon 300.

Then it continues normally:

0-299, infer on 301
0-300, infer on 302
0-301, infer on 303
…
0-333, infer on 335
end
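For anyone who wants to reproduce the schedule above in a local backtest, here is a small sketch. It assumes a one-date embargo (the inference date is always the last training date plus 2) and that training always starts at date 0, as described in this thread; the function and parameter names are made up for the example.

```python
EMBARGO = 1  # number of dates skipped between the last training date and the inference date


def out_of_sample_schedule(first_infer=300, last_infer=335, embargo=EMBARGO):
    """Yield (last_train_date, infer_date) pairs; training always starts at date 0."""
    for infer_date in range(first_infer, last_infer + 1):
        yield infer_date - embargo - 1, infer_date


schedule = list(out_of_sample_schedule())
for last_train_date, infer_date in schedule[:3]:
    print(f"0-{last_train_date}, infer on {infer_date}")
```

The first pair is train 0-298 with inference on 300, and the last is train 0-333 with inference on 335, matching the list above.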
