I am running inference on a batch of time series with a trained model. Everything works fine up to time series #993: each one is processed, classified, and a probability is returned.
But when the loop reaches #994 (task no. 1212), the code hangs indefinitely. Nothing is printed (I tried print statements and even tqdm, but no output appears). The job just sits there for about 15 hours, until it fails with a time limit error.
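One way to find out where the process is stuck is Python's standard `faulthandler` module, which can dump the traceback of every thread once a call has run longer than a chosen timeout. The sketch below is only an illustration: `run_with_watchdog` and the commented-out `classify` are hypothetical names, not part of my actual code, and the 600-second timeout is an arbitrary assumption. The `flush=True` matters because stdout is often block-buffered when it is redirected to a job log, which could explain missing print output.

```python
import faulthandler
import sys


def run_with_watchdog(func, *args, timeout_s=600.0, file=sys.stderr):
    """Call func(*args). If the call is still running after timeout_s seconds,
    dump the traceback of every thread to `file`; the process keeps running."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)
    try:
        return func(*args)
    finally:
        # Disarm the watchdog as soon as the call returns or raises.
        faulthandler.cancel_dump_traceback_later()


# Hypothetical usage inside the inference loop (classify is a placeholder):
# for i, series in enumerate(all_series):
#     print(f"starting series {i}", flush=True)  # flush so the line reaches the log
#     prob = run_with_watchdog(classify, series, timeout_s=600.0)
```

If the traceback points into a native library call rather than Python code, the hang is probably inside the model or a numerical routine and not in the loop itself.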
What’s strange is that on my own machine I can run inference on local sets (e.g. 10,001 training series or 101 test series) without any issues. The models are already trained, so this is inference only, with no training involved.
Could there be something special about time series #994, such as a pattern or feature that never appeared in the training or test data, that causes the code to freeze?
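To check whether #994 is unusual, a quick summary of its basic properties can be compared against a series that runs fine. This is a minimal sketch in plain Python, assuming the series can be read as a flat sequence of numbers. `describe_series` is a hypothetical helper, and the properties it reports (length, NaN/inf counts, range, constant values) are only guesses at what might matter.

```python
import math


def describe_series(values):
    """Summarize properties of one series that could upset a model or a
    preprocessing step: length, NaN/inf counts, value range, constant input."""
    values = [float(v) for v in values]
    finite = [v for v in values if math.isfinite(v)]
    return {
        "length": len(values),
        "n_nan": sum(math.isnan(v) for v in values),
        "n_inf": sum(math.isinf(v) for v in values),
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
        # True when all finite values are identical (or there are none).
        "is_constant": len(set(finite)) <= 1,
    }


# Hypothetical usage: compare the suspicious series with one that works.
# print(describe_series(all_series[993]), flush=True)
# print(describe_series(all_series[992]), flush=True)
```

A much longer series, an empty one, or one containing NaN or infinite values would be an obvious candidate for the cause of the hang.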
Has anyone experienced something similar, or do you have advice on how to debug this specific series?
