Ray's built-in ActorPool does not support distributed work stealing or shared state across multiple remote callers: each ActorPool instance maintains its own local state, so it cannot coordinate load or enable true work stealing in a distributed setting. There is also no built-in Ray primitive equivalent to asyncio.Future for lightweight, distributed job tracking and resolution between actors, and Ray Queues are not designed for work stealing or actor load tracking either. According to this discussion, the only way to share an ActorPool across callers is to wrap it in a dedicated manager actor, but this still centralizes coordination and does not provide distributed work stealing.
If you cannot use a single coordinator, you will need to implement a custom solution, such as a shared Ray Queue or a custom manager actor that tracks job assignments and actor load; work stealing itself is not natively supported and would have to be built on top of these primitives. Ray Datasets or Ray Data Pipelines may offer better built-in support for distributed load balancing and backpressure, as suggested in this thread. Would you like more detail on how to implement such a custom manager or alternative patterns?