How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
I got the following error after initialising a queue:
File "/tmp/ray/session_2023-03-11_20-32-22_575180_88/runtime_resources/working_dir_files/_ray_pkg_b7b314c25cc80b56/tools/detection_actor.py", line 82, in start_job
await self.det_holders[job_id].put_async(DetectionObj(dets, job_id))
File "/usr/local/lib/python3.8/dist-packages/ray/util/queue.py", line 132, in put_async
await self.actor.put.remote(item, timeout)
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
class_name: _QueueActor
actor_id: 7ebe8644ea290e8661bbe7fd07000000
pid: 1995
name: 440_dh
namespace: raypipe
ip: 172.16.30.130
The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
The actor never ran - it was cancelled before it started running.
Here, I first initialise a queue named “det_holder” and pass it to two different actors: a detection actor that adds detection objects to the queue, and a tracking actor that reads those objects from it.
After the queue was initialised, the detection actor threw the error above when it tried to put the object for the very first frame. This suggests that Ray could not initialise the queue actor properly.
I have not been able to reproduce this issue. Can anyone suggest what the cause might be?