I have been getting this error for my submissions.
I tried pinning the library versions I use in my local setup, and I also tried leaving them unpinned (the crunch CLI uses pandas and pyarrow, so I suspected I might be breaking something there). Neither helped.
Could you please help me understand what's going on before I hit the submission quota limit?
Here's the longer traceback:
9:51:49 AM
  File "/usr/local/lib/python3.10/site-packages/crunch/runner/cloud.py", line 528, in sandbox
    return utils.read(self.prediction_path)
  File "/usr/local/lib/python3.10/site-packages/crunch/utils.py", line 125, in read
    return pandas.read_parquet(path, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parquet.py", line 667, in read_parquet
    return impl.read(
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parquet.py", line 274, in read
    pa_table = self.api.parquet.read_table(
  File "/usr/local/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 1793, in read_table
    dataset = ParquetDataset(
  File "/usr/local/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 1360, in __init__
    [fragment], schema=schema or fragment.physical_schema,
  File "pyarrow/_dataset.pyx", line 1431, in pyarrow._dataset.Fragment.physical_schema.__get__
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Could not open Parquet input source '<Buffer>': Parquet file size is 0 bytes
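
The last line of the traceback says the prediction file the runner tried to read was 0 bytes long. One way to rule out an empty or truncated file before spending a submission is a quick local check like the sketch below. It uses only the standard library and relies on the fact that an unencrypted Parquet file starts and ends with the 4-byte magic `PAR1`. The helper name `looks_like_parquet` and the example path are hypothetical and not part of crunch, and how the runner actually produces the prediction file is not shown in the traceback.

```python
import os

# Unencrypted Parquet files begin and end with this 4-byte marker.
PARQUET_MAGIC = b"PAR1"

# Smallest possible layout: leading magic (4) + footer length (4) + trailing magic (4).
MIN_PARQUET_SIZE = 12


def looks_like_parquet(path):
    """Return True if the file at `path` is non-empty and framed by the Parquet magic.

    Hypothetical helper for a local sanity check; it does not validate the
    schema or the data, only that the file is not empty or obviously truncated.
    """
    size = os.path.getsize(path)
    if size < MIN_PARQUET_SIZE:
        # Covers the 0-byte case reported in the traceback.
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For example, calling `looks_like_parquet("prediction.parquet")` (an assumed file name) on a locally written prediction returns `False` for a 0-byte file, which would match the failure above.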