RLlib: How to parallelize learning different scenarios correctly?

Yes, assigning a different scenario to each env_runner based on env_runner.worker_index is a correct and recommended approach in RLlib. The config passed to each environment instance is an EnvContext that carries worker_index, which you can use to select that instance's scenario. This is documented in the official RLlib environment guide, which shows how to use config.worker_index to customize each environment instance, so each EnvRunner always runs its assigned scenario. A connector is not necessary for this use case; selecting the scenario in the environment constructor is the standard, efficient method for scenario assignment in parallelized training setups, including multi-agent PPO.

For example, you can implement this by customizing your environment’s constructor to select the scenario based on config.worker_index, as shown in the RLlib documentation:

import gymnasium as gym

class EnvDependingOnWorkerAndVectorIndex(gym.Env):
    def __init__(self, config):
        # `config` is an RLlib EnvContext; worker_index identifies this EnvRunner.
        scenario = choose_scenario_for(config.worker_index)
        # Initialize observation/action spaces etc. from the chosen scenario.
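The `choose_scenario_for` helper above is not part of RLlib; it is a hypothetical function you would define yourself. A minimal sketch, assuming a fixed list of scenario names and a simple round-robin assignment over worker indices (worker 0 is the local EnvRunner):

```python
# Hypothetical scenario list; replace with your own scenario identifiers.
SCENARIOS = ["scenario_a", "scenario_b", "scenario_c"]

def choose_scenario_for(worker_index: int) -> str:
    """Map an EnvRunner's worker_index to a scenario (round-robin)."""
    return SCENARIOS[worker_index % len(SCENARIOS)]
```

With more EnvRunners than scenarios, several workers share a scenario; with fewer, some scenarios are simply not run. Pass the number of EnvRunners via your algorithm config (e.g. `config.env_runners(num_env_runners=...)`) so the mapping covers the scenarios you care about.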
