ObjectFetchTimedOutError

1. Severity of the issue: (select one)
High: Completely blocks me.

2. Environment:

  • Ray version: 2.23
  • Python version: 3.10
  • OS:
  • Cloud/Infrastructure: managed k8s
  • Other libs/tools (if relevant):

3. What happened vs. what you expected:

  • Expected: map_batches(BatchPredictor) should run to completion over all of the blocks.
  • Actual: The job fails abruptly with ObjectFetchTimedOutError after running for about 6 hours. Each Parquet file is approximately 14 MB, and there are a huge number of them (approximately 17K).
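
For reference, the pipeline is roughly shaped like the sketch below; the bucket paths and the dummy predictor body are placeholders standing in for our actual code.

```python
import pandas as pd
import ray


class BatchPredictor:
    # Stand-in for the real model wrapper: the real implementation loads
    # the model once per worker and reuses it for every batch.
    def __call__(self, batch: pd.DataFrame) -> pd.DataFrame:
        print("length of batch", len(batch))  # matches the log lines below
        batch["prediction"] = 0.0             # dummy inference output
        return batch


# ~17K Parquet files, ~14 MB each (paths are placeholders)
ds = ray.data.read_parquet("s3://bucket/input/")

ds.map_batches(
    BatchPredictor,
    batch_size=256,         # "length of batch 256" in the logs
    batch_format="pandas",
    concurrency=8,          # illustrative actor-pool size
).write_parquet("s3://bucket/output/")  # fails at block 304/305
```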



event_log":" File \"/home/ray/anaconda3/lib/python3.10/site-packages/ray/data/dataset.py\", line 4625, in materialize"}
event_log":" File \"/home/ray/anaconda3/lib/python3.10/site-packages/ray/data/exceptions.py\", line 86, in handle_trace"}
event_log":"2025-06-07 08:21:45,736\tINFO cli.py:83 -- Status message: Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):"}
event_log":"ray.exceptions.ObjectFetchTimedOutError: Failed to retrieve object 1ae8e0c85369413affffffffffffffffffffffff0400000002000000. To see information about where this ObjectRef was created in Python, set the environment variable RAY_record_ref_creation_sites=1 during `ray start` and `ray.init()`."}
event_log":"\u001b[36m(MapWorker(MapBatches(BatchPredictor)) pid=9532, )\u001b[0m length of batch 256\u001b[32m [repeated 7x across cluster]\u001b[0m"}
event_log":"Fetch for object 1ae8e0c85369413affffffffffffffffffffffff0400000002000000 timed out because no locations were found for the object. This may indicate a system-level bug."}
event_log":"2025-06-07 08:21:45,736\tERR cli.py:70 -- \u001b[31m---------------------------------------------\u001b[39m"}
event_log":" raise e.with_traceback(None) from SystemException()"}
event_log":" copy._plan.execute(force_read=True)"}
event_log":"ray.exceptions.RayTaskError(ObjectFetchTimedOutError): \u001b[36mray::MapBatches(BatchPredictor)()\u001b[39m (pid=9288, ip=, actor_id=6a0d3d23a3c4bbc5c91b323804000000, repr=MapWorker(MapBatches(BatchPredictor)))"}
event_log":" At least one of the input arguments for this task could not be computed:"}
event_log":" File \"/app/inference.py\", line 233, in <module>"}
event_log":" raise e.with_traceback(None) from SystemException()"}
event_log":" copy._plan.execute(force_read=True)"}
event_log":"The above exception was the direct cause of the following exception:"}
event_log":"2025-06-07 08:21:45,735\tERR cli.py:68 -- \u001b[31m---------------------------------------------\u001b[39m"}
event_log":"ray.exceptions.RayTaskError(ObjectFetchTimedOutError): \u001b[36mray::MapBatches(BatchPredictor)()\u001b[39m (pid=9288, ip=, actor_id=6a0d3d23a3c4bbc5c91b323804000000, repr=MapWorker(MapBatches(BatchPredictor)))"}
event_log":" (ds.map_batches(BatchPredictor,"}
event_log":"- Write: 0 active, 0 queued, [cpu: 0.0, objects: 0.0B]: 100%|█████████▉| 304/305 [6:07:43<01:40, 100.38s/it] \u001b[A\u001b[A\u001b[A"}
event_log":" File \"/home/ray/anaconda3/lib/python3.10/site-packages/ray/data/dataset.py\", line 2818, in write_parquet"}
event_log":" self.write_datasink("}
event_log":"\u001b[36m(MapWorker(MapBatches(BatchPredictor)) pid=9532, ip=)\u001b[0m length of batch 256\u001b[32m [repeated 7x across cluster]\u001b[0m"}
event_log":" \u001b[A\u001b[A\u001b[A2025-06-07 08:21:40,093\tERROR exceptions.py:73 -- Exception occurred in Ray Data or Ray Core internal code. If you continue to see this error, please open an issue on the Ray project GitHub page with the full stack trace below: https://github.com/ray-project/ray/issues/new/choose"}

Hi Rags, and welcome to the Ray community! Can you try `ray memory --stats-only` to see how much memory your ops are using? If usage is around 80%, consider adding more resources. Do any of your other logs show anything unusual?
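
If it's easier to check programmatically, here is a minimal sketch for eyeballing cluster-wide object store usage from Python (it assumes the driver attaches to the running cluster with `address="auto"`, and it's an approximation, not a replacement for `ray memory`):

```python
import ray

ray.init(address="auto")  # attach to the already-running cluster

# Cluster-wide object store capacity vs. what is currently free, in bytes.
total = ray.cluster_resources().get("object_store_memory", 0)
free = ray.available_resources().get("object_store_memory", 0)
if total:
    used = 1 - free / total
    print(f"object store used: {used:.0%} of {total / 1e9:.1f} GB")
```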

Someone else ran into a similar issue here: Fetch for object reference timed out because no locations were found for the object - #3 by Buvaneash

There were also a few object store bug fixes shipped in the later 2.4X releases of Ray, so you could also try upgrading your Ray version and seeing if that resolves the issue.
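
The traceback itself also suggests a debugging step worth enabling before the next run. A minimal sketch (`address="auto"` is an assumption about how you attach to the cluster):

```python
import os
import ray

# Per the error message: record where each ObjectRef is created, so the
# next ObjectFetchTimedOutError can point back at the offending call site.
# The same variable must also be set on nodes started via `ray start`.
os.environ["RAY_record_ref_creation_sites"] = "1"

ray.init(address="auto")
```

Recording creation sites adds some bookkeeping overhead, so treat it as a temporary debugging switch rather than a permanent setting.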