My approach (4.22 CV / 3.704 public LB)

salty-francisco · August 18, 2023, 9:28am

Approach

I’d like to detail my approach and share the validation results.
All validation results are averages over moons 50 to 268.

I used four types of models:

LightGBM: CV 3.74, 100 features, loss rmse
ExtraTreesRegressor: CV 3.65, 100 features, loss squared_error
CatBoost: CV 3.68, 89 features, loss rmse
NN: CV 3.56, 89 features, loss mse

I selected the features with forward feature selection.
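
A minimal sketch of greedy forward selection, for readers who have not used it. The scorer here (ordinary least squares on a single hold-out split, scored by negative MSE) and the names `val_score` and `forward_select` are assumptions for illustration only; they are not necessarily the models, splits, or metric used for the results above.

```python
import numpy as np

def val_score(X_tr, y_tr, X_va, y_va, cols):
    """Fit least squares on the chosen columns; return negative validation MSE."""
    A_tr = np.column_stack([np.ones(len(X_tr)), X_tr[:, cols]])
    A_va = np.column_stack([np.ones(len(X_va)), X_va[:, cols]])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return -np.mean((A_va @ coef - y_va) ** 2)

def forward_select(X_tr, y_tr, X_va, y_va, max_features):
    """Greedily add the feature that most improves the validation score."""
    selected, best = [], -np.inf
    remaining = list(range(X_tr.shape[1]))
    while remaining and len(selected) < max_features:
        scores = {j: val_score(X_tr, y_tr, X_va, y_va, selected + [j])
                  for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best:
            break  # no candidate improves the score, so stop early
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected

# Synthetic data (hypothetical): only columns 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = 2.0 * X[:, 1] - 1.0 * X[:, 4] + 0.1 * rng.normal(size=400)
selected = forward_select(X[:300], y[:300], X[300:], y[300:], max_features=3)
```

In practice the scorer would be replaced by the actual model and the rolling validation score, which makes each selection step much more expensive.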

Validation and Results:
Stacking these models gave CV 4.22 (public LB 3.704).
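
A minimal sketch of one common way to stack, assuming a linear meta-model fitted on out-of-sample predictions of the base models. The meta-learner actually used is not stated here, so the least-squares blend, the helper names `fit_stack_weights` and `stack_predict`, and the synthetic data are all assumptions.

```python
import numpy as np

def fit_stack_weights(oos_preds, y):
    """Least-squares blend weights (intercept first) on out-of-sample base predictions."""
    A = np.column_stack([np.ones(len(y)), oos_preds])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def stack_predict(w, preds):
    """Apply the blend: intercept plus weighted sum of base predictions."""
    return w[0] + preds @ w[1:]

# Synthetic stand-in for four base models: truth plus independent noise.
rng = np.random.default_rng(0)
y = rng.normal(size=500)
base = np.column_stack(
    [y + rng.normal(scale=s, size=500) for s in (1.0, 1.2, 0.9, 1.1)]
)

# Fit the blend on the first half, evaluate on the second half.
w = fit_stack_weights(base[:250], y[:250])
blend = stack_predict(w, base[250:])

mse_blend = np.mean((blend - y[250:]) ** 2)
mse_best_single = min(np.mean((base[250:, k] - y[250:]) ** 2) for k in range(4))
```

The blend weights should be fitted only on predictions the base models made for data they were not trained on; otherwise the meta-model rewards whichever base model overfits most.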

Target transformation
np.log1p(target)
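
A small sketch of the transform and its inverse, using made-up values. `np.log1p(y)` computes log(1 + y), so it needs y > -1, and `np.expm1` undoes it. Because the transform is strictly increasing, it changes the scale of the target but not the ordering of its values.

```python
import numpy as np

# Hypothetical non-negative target values, purely for illustration.
target = np.array([0.0, 0.5, 3.0, 120.0])

# Forward transform applied before fitting: log(1 + y).
target_log = np.log1p(target)

# Inverse transform to map predictions back to the original scale: exp(z) - 1.
target_back = np.expm1(target_log)

# The ordering of the values is unchanged by the transform.
same_order = np.array_equal(np.argsort(target), np.argsort(target_log))
```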

Extra features
Number of IDs in the previous moons
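
One possible reading of this feature, sketched with pandas: count the distinct IDs in each moon and attach the previous moon's count to every row. The column names `moon` and `id`, the toy data, and the choice of a one-moon lag are assumptions; the exact definition is not given above.

```python
import pandas as pd

# Toy data with hypothetical column names.
df = pd.DataFrame({
    "moon": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "id":   ["a", "b", "c", "a", "b", "a", "b", "c", "d"],
})

# Distinct IDs per moon, ordered by moon.
ids_per_moon = df.groupby("moon")["id"].nunique().sort_index()

# Shift by one position so each moon sees the previous moon's count.
# This assumes the moons in the data are consecutive; the first moon gets NaN.
prev_counts = ids_per_moon.shift(1)

# Broadcast the lagged count back to the rows.
df["n_ids_prev_moon"] = df["moon"].map(prev_counts)
```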

CV-LB consistency
I found that large CV improvements were reflected in the LB, but small improvements sometimes were not, because of a certain degree of randomness in the scores.

Historical data and retraining
I found that training on only the last 10 moons gave the best results in my case, so I retrain every moon, using the last 10 moons as training data.
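
The retraining schedule can be sketched as a generator of (training moons, test moon) pairs. The function name `rolling_splits` and its parameters are hypothetical, and the sketch assumes no gap between the training window and the test moon, which is not specified above.

```python
def rolling_splits(moons, window=10, first_test=None):
    """Yield (train_moons, test_moon): train on the `window` moons before each test moon."""
    moons = sorted(set(moons))
    for i, test_moon in enumerate(moons):
        if i < window:
            continue  # not enough history yet for a full window
        if first_test is not None and test_moon < first_test:
            continue  # skip moons before the evaluation period
        yield moons[i - window:i], test_moon

# Example: moons 0..268, evaluation from moon 50 to moon 268.
splits = list(rolling_splits(range(269), window=10, first_test=50))
first_train, first_test_moon = splits[0]
last_train, last_test_moon = splits[-1]
```

For each pair, a model would be fitted on rows from the training moons and scored on the test moon; averaging those scores gives a rolling-origin CV estimate.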

Observation on Randomness:
During the competition, my best public submission (4.15 LB) came about inadvertently, from a bug in the code. This suggests there is a certain degree of randomness inherent in the competition.

subtle-rajat · August 18, 2023, 4:46pm

What train frequency did you set?

salty-francisco · August 18, 2023, 5:18pm

I retrain every moon because my training time is short.

narrow-oskar · August 21, 2023, 12:31pm

The CV scores you presented were calculated using a rolling-origin CV (retraining on each date using the last 10 dates). Correct?

salty-francisco · August 21, 2023, 12:41pm

Yes, exactly!

anxious-james · August 25, 2023, 4:04pm

Can you explain the rationale behind this target transformation?

tropical-lucas · August 27, 2023, 5:57pm

I am curious now. How were the features selected?
