(raylet) object_manager.cc:293: Couldn't send pull request from

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

In a high-IO scenario, Ray objects are transferred between actors continually.
I get the error below, and then the job hangs.

(raylet, ip=[fdbd:dc01:27:184:e700::2e]) [2023-01-19 17:35:10,671 E 25 25] (raylet) object_manager.cc:293: Couldn’t send pull request from eb7cb2f5c63ad82f69fc16cee6a752927b9c27d4100a98d23d130790 to f897d1e2e760002dac5863d2cfd1f9ce90da1552608462c7929bed0b of object 004a595b632bc88019a74f5a3167fe12b40f6ed50700000003000000 , setup rpc connection failed.

Are there any suggestions for determining:

  1. whether the object has already been destroyed
  2. whether the raylet that holds the object is down
  3. or whether some other similar condition applies when you see this error

Thanks

Likely you should see other error messages for (1) and (2).

The "setup rpc connection failed" part makes it look like there is some issue with setting up the connection between raylets, maybe due to stress. Can you file a GitHub issue for this one and, if possible, provide a reproducible example? This seems like a bug.
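In the meantime, to help rule out (2), here is a minimal sketch that pulls the two node IDs and the object ID out of the log line above and looks a node up in a list shaped like the output of `ray.nodes()`. The regex is only inferred from the single message quoted in this thread, and the helper names `parse_pull_failure` and `node_status` are made up for illustration; they are not part of Ray.

```python
import re

# Pattern inferred from the one raylet message quoted in this thread (an assumption).
# "Couldn.t" accepts either a straight or a curly apostrophe.
PULL_FAILURE = re.compile(
    r"Couldn.t send pull request from (?P<src>[0-9a-f]+) "
    r"to (?P<dst>[0-9a-f]+) of object (?P<obj>[0-9a-f]+)"
)


def parse_pull_failure(line):
    """Return a dict with the 'src' and 'dst' node IDs and the 'obj' ID, or None."""
    match = PULL_FAILURE.search(line)
    return match.groupdict() if match else None


def node_status(node_id, nodes):
    """Look up node_id in a list of dicts with 'NodeID' and 'Alive' keys."""
    for node in nodes:
        if node.get("NodeID") == node_id:
            return "alive" if node.get("Alive") else "dead"
    return "unknown"
```

On a running cluster, something like `node_status(info["dst"], ray.nodes())` after `ray.init(address="auto")` should show whether Ray still considers that node alive. If both nodes are reported alive, that would point more towards a connection problem between the raylets than a dead raylet.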

OK, I will try to extract a reproducible example from my code and file an issue.