Hello, I have a few questions about how actors interact with environments.
Case 1: num_workers:1 and num_envs_per_worker:5 and remote_worker_envs:False
In this case, one actor-critic policy interacts with 5 environments. (I assume "actor" and "worker" mean the same thing.)
Case 2: num_workers:5 and num_envs_per_worker:1
In this case, 5 policies interact with 5 environments, one environment per worker.
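To make the two cases concrete, here is a minimal sketch of them as plain Python dicts. The keys are the settings named above. The helper `total_envs` is hypothetical (it is not part of any library) and simply assumes each worker holds `num_envs_per_worker` environments, as described in the two cases.

```python
# Plain dicts mirroring the two cases; nothing here calls into RLlib.
case_1 = {"num_workers": 1, "num_envs_per_worker": 5, "remote_worker_envs": False}
case_2 = {"num_workers": 5, "num_envs_per_worker": 1}


def total_envs(config):
    """Hypothetical helper: number of environments across all workers,
    assuming every worker holds num_envs_per_worker environments."""
    return config["num_workers"] * config["num_envs_per_worker"]


for name, cfg in (("case 1", case_1), ("case 2", case_2)):
    print(name, "->", cfg["num_workers"], "worker(s),",
          total_envs(cfg), "environments in total")
```

Both cases end up with the same total number of environments; they differ only in how those environments are distributed across workers.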
My question is: do the policies interact with their environments in parallel or sequentially, and how can I tell? Also, what difference does setting remote_worker_envs:True make when num_envs_per_worker > 1?
Could anyone please clarify this?