@sangcho I posted the issue in a new thread, "Periodic _MultiThreadedRendezvous failure leaves cluster in damaged state", and mentioned you there. Thanks!
djakubiec