Why is inference taking longer than expected?

Hi, I’m curious why my last run took so much time. Each dataset prediction usually takes less than 2 minutes on your server. My model doesn’t require any training—just inference. What could be the reason? run#52977

Python's output buffering is delaying the logs, so they are not appearing right away…

Your code is running, and likely without issues, but the logs are just not showing.
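If the delay comes from buffered stdout, forcing flushes on the participant's side usually makes the logs show up immediately. A minimal sketch of the standard Python options (nothing here is specific to this platform):

```python
import sys

def log(msg: str) -> None:
    # flush=True pushes the line out immediately instead of waiting
    # for the buffer to fill, so the log viewer sees it right away.
    print(msg, flush=True)

log("starting inference")

# Alternative: switch stdout to line buffering once for the whole run
# (Python 3.7+, when stdout is a regular text stream):
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(line_buffering=True)

# Or, with no code change at all, launch the job with the environment
# variable PYTHONUNBUFFERED=1 to disable buffering interpreter-wide.
```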


Could you please check the logs and let me know what happened?

I used tqdm to track the execution time and noticed that more than 100 datasets are being used to evaluate the models. How many datasets are actually used for evaluation? As I mentioned, my code only performs inference and does not use any training datasets at all. I have observed that inference for each dataset takes less than one minute. Given this, is the 15-hour budget sufficient for these specifications?

I did not find any issue in your logs. I do agree they took a long time to appear.

In the cloud, you are given an X_test of 10’000 datasets, which is 100x what you have locally.

Taking your estimate of less than 2 minutes per dataset:

  • 2 minutes * 10’000 datasets = 20’000 minutes ~= 333.33 hours

But taking a more realistic number from your logs of 40-50 s per dataset:

  • 45 seconds * 10’000 datasets = 450’000 seconds = 125 hours

Yes, it is still too much.
Even if we could give you more quota, it would still be more than 5 days of continuous compute.
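The arithmetic above can be checked in a few lines, using the per-dataset timings quoted in this thread:

```python
N_DATASETS = 10_000

# Optimistic estimate from the original post: < 2 minutes per dataset.
hours_at_2min = 2 * 60 * N_DATASETS / 3600
# More realistic estimate from the logs: ~45 seconds per dataset.
hours_at_45s = 45 * N_DATASETS / 3600

print(round(hours_at_2min, 2))  # 333.33 hours
print(hours_at_45s)             # 125.0 hours
```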

Also, the system normally cuts the logs after the first 1’000 lines and shows only the last 500 lines once the run is over, so we will be blind to most of it, since your tqdm prints one line per dataset.

This usually doesn’t happen, as models go through the datasets faster than the 10-second minimum update interval we enforce in the cloud environment.
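If you want to keep tqdm without flooding the logs, its `mininterval` and `miniters` parameters throttle how often a new progress line is emitted; a sketch, with the 10-second value chosen to mirror the cloud's minimum update interval:

```python
from tqdm import tqdm

results = []
# mininterval=10 emits at most one progress update every 10 seconds,
# and miniters=100 requires at least 100 iterations between updates,
# so a redirected log gets a handful of lines instead of one per dataset.
for dataset in tqdm(range(10_000), mininterval=10, miniters=100):
    results.append(dataset)  # stand-in for the actual inference call
```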

Thank you for the information. Considering the 15-hour maximum budget for running on this large number of datasets, the inference time for each dataset must be less than 5.4 seconds. This means that most probabilistic methods are not suitable for this competition, even if they could potentially yield better results.
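That 5.4-second figure follows directly from dividing the budget by the dataset count:

```python
BUDGET_HOURS = 15
N_DATASETS = 10_000

# Maximum average inference time per dataset to fit in the budget.
seconds_per_dataset = BUDGET_HOURS * 3600 / N_DATASETS
print(seconds_per_dataset)  # 5.4
```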

Indeed, a participant must be able to optimize their code so it can run within the budget.

Another thing is that your model might not need to be re-trained, as the training data will likely not change, so the 15 hours could be dedicated to inference only.