Why does my code always hang at time series #994 during inference?

artur · September 17, 2025, 12:15pm

I am running inference on a batch of time series with a trained model. Everything works fine up to time series #993: each one is processed, classified, and a probability is returned.

But when the loop reaches #994 (task nr. 1212), the code hangs indefinitely. Nothing is printed (I tried print statements and even tqdm, but no output appears). It just sits there until ~15 hours later, when the job fails with a time limit error.

What’s strange is that I can run inference locally on the local sets (e.g. 10,001 training series or 101 test series) without any issues. The models are already trained, so this is inference only, with no training involved.

Could there be something special about time series #994 — maybe some pattern or feature that never appeared in the training or test data — that causes the code to freeze?

Has anyone experienced something similar, or do you have advice on how to debug this specific series?

enzo · September 17, 2025, 1:51pm

A couple of things:

The logs are limited to the first 1000 lines and the last 500 lines, so the apparent “hang” is most likely your code printing so much that the output past line 1000 is no longer shown. Progress bars are notorious for that. I suggest printing a status line only once every 1000 datasets, just to confirm that the code is still running.
You need to run inference on 10’000 datasets, not just 100 as you did locally (the overview was updated recently). This is likely the cause of your timeout.
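For reference, here is a minimal sketch of the "print only every 1000 datasets" idea in Python. The names `run_inference`, `predict`, and `log_every` are hypothetical and not part of the competition's API; `predict` stands in for whatever function returns a probability for one time series.

```python
import sys
from typing import Callable, Iterable, List, Sequence


def run_inference(
    datasets: Iterable[Sequence[float]],
    predict: Callable[[Sequence[float]], float],  # hypothetical per-series model call
    log_every: int = 1000,
    out=None,
) -> List[float]:
    """Run predict() on every series, printing one status line per log_every items."""
    out = sys.stdout if out is None else out
    results: List[float] = []
    for i, series in enumerate(datasets, start=1):
        results.append(predict(series))
        # One line per `log_every` series keeps the total output small,
        # so it is less likely to run into a log line limit.
        if i % log_every == 0:
            # flush=True pushes the line out immediately instead of buffering it
            print(f"processed {i} datasets", file=out, flush=True)
    print(f"done: {len(results)} datasets", file=out, flush=True)
    return results
```

With 10’000 datasets and `log_every=1000`, this prints 11 lines in total, which stays well under the 1000-line limit mentioned above. It only makes progress visible; it does not by itself make the run any faster.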
