When running multi-environment sampling, I found that sampling efficiency dropped sharply as the number of environments increased. What could be the reason?

  • High: It blocks me from completing my task.

My multi-environment configuration is as follows:
"num_workers": 5,  # 2,
"num_envs_per_worker": 5,
"num_gpus_per_worker": 0.1,
"num_cpus_per_worker": 10,
"sample_async": True,
With a single environment, 20 samples are collected per second, but when the number of environments reaches 25, only 40 samples are collected per second. I tried setting remote_worker_envs, but it did not solve the problem.

when the number of environments reaches 25

Does this mean that num_envs_per_worker was increased to 25, or that num_envs_per_worker x num_workers equals 25?
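For reference, the arithmetic behind the numbers in the question can be written out. This is only a sketch using the values quoted in the post (5 workers, 5 environments per worker, 20 samples/s with one environment, 40 samples/s with 25). The function name and the linear-scaling baseline are illustrative assumptions, not something RLlib reports.

```python
def scaling_efficiency(num_workers, num_envs_per_worker,
                       single_env_rate, measured_rate):
    """Compare a measured sampling rate with ideal linear scaling.

    Hypothetical helper: assumes every environment could ideally
    sample as fast as the single-environment baseline.
    """
    total_envs = num_workers * num_envs_per_worker
    ideal_rate = single_env_rate * total_envs
    return total_envs, measured_rate / ideal_rate


# Values taken from the configuration and rates quoted in the question.
total_envs, efficiency = scaling_efficiency(
    num_workers=5,
    num_envs_per_worker=5,
    single_env_rate=20.0,
    measured_rate=40.0,
)
print(total_envs, efficiency)
```

With the posted configuration the product num_workers x num_envs_per_worker is 25, and the measured rate is a small fraction of what linear scaling would give.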

Where do you measure the sampling frequency? Are these values from TensorBoard? What are the exact names of the metrics? The “simple and fits most cases” setup would be:

num_workers = N-1, # N is number of CPU cores available to you
num_cpus_per_worker = 1,  # If your environment does not require lots of CPUs but is, for example, a simple single-threaded Atari env, 1 is most efficient
num_envs_per_worker = M # Choose M such that your samples are distributed well enough that your algorithm does not optimize in a "bad" direction. Your rollouts should represent the distribution of transitions under your policy and the environment at hand well enough for optimization to make sense (optimizing on a single episode of a single environment can be pretty suboptimal; be aware that rollout fragment length and, in recurrent cases, sequence lengths are also relevant here.)
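As a rough sketch, the setup above could be derived from the machine's core count like this. The helper name and the example values (8 cores, M = 4) are made up for illustration; only the three config keys come from the thread.

```python
import os


def suggested_resources(num_cores=None, envs_per_worker=1):
    """Build the 'simple and fits most cases' settings as a dict.

    Hypothetical helper: num_cores defaults to what the OS reports,
    and envs_per_worker is the M you pick for your problem.
    """
    if num_cores is None:
        # os.cpu_count() can return None, so fall back to 1.
        num_cores = os.cpu_count() or 1
    return {
        "num_workers": max(num_cores - 1, 0),  # N - 1, one core left for the driver
        "num_cpus_per_worker": 1,              # enough for a light, single-threaded env
        "num_envs_per_worker": envs_per_worker,
    }


# Example with assumed values: an 8-core machine and M = 4.
config = suggested_resources(num_cores=8, envs_per_worker=4)
print(config)
```

The resulting dict can be merged into the rest of the trainer config. M still has to be chosen per problem, as described above.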

If there is still trouble, please post two small reproduction scripts that show exactly which numbers you are changing and what effect you expect vs. what happens on your side.

Hello, the sampling frequency is calculated by dividing ts (timesteps) by total time.


Above is Ray’s log output during training. I don’t know whether it is related to the size of the GPU in my training process. The training speed cannot keep up with the sampling speed. Where should I look if I want to see the actual sampling frequency?
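One way to make the "timesteps divided by total time" calculation less sensitive to startup cost is to compare two consecutive results instead of using the cumulative totals. The sketch below assumes each training result is a dict containing timesteps_total and time_total_s. The helper name and the sample numbers are invented for illustration.

```python
def steps_per_second(prev_result, curr_result):
    """Throughput between two consecutive training results.

    Assumes both dicts carry 'timesteps_total' and 'time_total_s';
    check the keys your Ray version actually reports.
    """
    d_steps = curr_result["timesteps_total"] - prev_result["timesteps_total"]
    d_time = curr_result["time_total_s"] - prev_result["time_total_s"]
    if d_time <= 0:
        raise ValueError("results must be ordered by increasing total time")
    return d_steps / d_time


# Made-up results standing in for two consecutive training iterations.
prev = {"timesteps_total": 4000, "time_total_s": 100.0}
curr = {"timesteps_total": 8000, "time_total_s": 180.0}
rate = steps_per_second(prev, curr)
print(rate)
```

Note that this measures the whole loop, sampling plus training. If training is the bottleneck, the number will be lower than what the samplers could deliver on their own.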