EnvRunners vs. VectorEnvs for Simulators Distributed Across a Network

How severely does this issue affect your experience of using Ray?

  • Medium: It makes completing my task significantly more difficult, but I can work around it.

We are currently scaling simulators with Ray to gather experience across our network infrastructure. What is the recommended way to implement this in RLlib?

So far we have integrated the communication stack into the VectorEnv through a customized EnvRunner and VectorEnv implementation, but this seems to cause considerable overhead because each sim needs its own thread and event loop. The event loops in turn create downstream problems with thread-context ownership, which gRPC is sensitive to.
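To make the pattern concrete, here is a stripped-down sketch of roughly what this looks like. It is not our actual code: `SimWorker`, `step_async`, and `step_all` are hypothetical names, and the network call to the simulator is replaced by `asyncio.sleep`.

```python
import asyncio
import threading
from concurrent.futures import Future


class SimWorker:
    """Hypothetical stand-in for one sub-env: owns one thread and one event loop."""

    def __init__(self, sim_id: int) -> None:
        self.sim_id = sim_id
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self._run_loop, daemon=True)
        self.thread.start()

    def _run_loop(self) -> None:
        # The loop is bound to this worker thread; anything created on it
        # (channels, stubs) must only be used from this thread.
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    async def _remote_step(self, action: int) -> dict:
        # Placeholder for the real (slow) call to the simulator.
        await asyncio.sleep(0.01)
        return {"sim": self.sim_id, "obs": action + 1}

    def step_async(self, action: int) -> Future:
        # Schedule the coroutine on the worker's loop from the caller's thread.
        return asyncio.run_coroutine_threadsafe(self._remote_step(action), self.loop)

    def close(self) -> None:
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join()
        self.loop.close()


def step_all(workers: list, actions: list) -> list:
    """Vectorized step: fan out to every sim, then block until all have replied."""
    futures = [w.step_async(a) for w, a in zip(workers, actions)]
    return [f.result(timeout=5.0) for f in futures]
```

Every sub-env therefore costs one thread plus one loop, and each cross-thread hop goes through `run_coroutine_threadsafe`, which is where the ownership problems show up for us.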

Would it make sense to drop the VectorEnv altogether and instead use EnvRunners 1:1 (one EnvRunner per sim) as scalable actors and communicators? Or are there other best practices we should look at?
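As a rough illustration of the 1:1 alternative, the env could simply block on the simulator, with no thread or event loop of its own. This is only a sketch under assumptions: `BlockingSimClient` and `SingleSimEnv` are hypothetical names, the simulator is faked with `time.sleep`, and the class is duck-typed after the Gymnasium `reset`/`step` signatures rather than subclassing `gymnasium.Env`.

```python
import time
from typing import Any, Optional


class BlockingSimClient:
    """Hypothetical synchronous client; a real one might wrap a blocking gRPC stub."""

    def __init__(self, address: str) -> None:
        self.address = address
        self.t = 0

    def reset(self) -> float:
        self.t = 0
        return 0.0

    def step(self, action: float) -> tuple:
        time.sleep(0.01)  # stands in for one slow simulator tick
        self.t += 1
        # Returns (observation, reward, terminated); the episode ends after 3 ticks.
        return float(self.t), -abs(action), self.t >= 3


class SingleSimEnv:
    """One env <-> one sim. All calls block in the caller's thread."""

    def __init__(self, config: Optional[dict] = None) -> None:
        config = config or {}
        # The address key and default value are assumptions for illustration.
        self.client = BlockingSimClient(config.get("address", "localhost:50051"))

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        return self.client.reset(), {}

    def step(self, action: Any):
        obs, reward, terminated = self.client.step(action)
        truncated = False
        return obs, reward, terminated, truncated, {}
```

A real version would subclass `gymnasium.Env` and define observation and action spaces. Assuming a recent RLlib release that exposes `AlgorithmConfig.env_runners(...)`, the idea would be to set `num_envs_per_env_runner=1` and scale out through `num_env_runners`, so that parallelism comes from Ray actors rather than from threads inside a VectorEnv.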

Note: the networking overhead is negligible here because the sims are very slow and resource-intensive, which is why we think plain EnvRunners might suffice.