Scaling an RLlib learning environment across multiple simulators and machines while reducing communication overhead

I started using RLlib to solve a problem, but I’m having difficulty scaling out the learning environment. I’m using a custom Gymnasium environment (a gym.Env subclass) together with a simulator, and training with RLlib and Tune using the SAC algorithm.

Now I want to use Ray’s capabilities to run multiple simulators and environments on multiple machines to speed up learning. However, due to licensing restrictions, only one simulator instance can run per computer, so I want each rollout worker to run on the same computer as its simulator to reduce communication overhead.

In short, I want one trainer worker that learns using a GPU, plus a separate rollout worker for each simulator computer. I have tried the following:

  1. Ray cluster: When I ran a script with num_rollout_workers = n on the head node, the rollout workers were not pinned to particular computers; the scheduler placed them automatically.
  2. Client-server: I tried to write a rollout worker client script and a trainer server script. However, this was difficult, and I struggled to find reference material.
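
To make the target layout concrete, it could be sketched roughly as below. This is only an illustration under assumptions, not a tested solution: the custom resource name "simulator", the helper names, and the head address are hypothetical, and the config keys follow RLlib's older dict-style config (newer releases set the same options through AlgorithmConfig methods). The idea is that each simulator machine advertises one unit of a custom resource when it joins the cluster, and each rollout worker requests one unit, which should limit the scheduler to one rollout worker per simulator machine.

```python
import json


def ray_start_command(head_address, has_simulator):
    """Build the `ray start` command for a machine joining the cluster.

    Machines that hold a simulator license advertise one unit of a custom
    resource called "simulator" (a hypothetical label for this sketch).
    """
    cmd = f"ray start --address={head_address}"
    if has_simulator:
        cmd += " --resources='" + json.dumps({"simulator": 1}) + "'"
    return cmd


def build_config(num_simulator_machines):
    """Dict-style RLlib config: one rollout worker per simulator machine."""
    return {
        "framework": "torch",
        # GPU used by the trainer process.
        "num_gpus": 1,
        # One rollout worker for each licensed simulator machine.
        "num_workers": num_simulator_machines,
        "num_envs_per_worker": 1,
        # Each rollout worker claims one "simulator" unit, so it can only
        # be placed on a machine that advertises that resource.
        "custom_resources_per_worker": {"simulator": 1},
    }


if __name__ == "__main__":
    # Hypothetical head address, for illustration only.
    print(ray_start_command("10.0.0.1:6379", has_simulator=True))
    print(build_config(num_simulator_machines=3))
```

The helpers only build the command string and the config dict; the dict would still need an "env" entry for the custom environment before being passed to SAC through Tune.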

What is the best way to achieve this setup? Thank you in advance.