Vectorized environment with different configurations

I am trying to use RLlib to train a PPO agent in a vectorized custom Gym environment (named MNISTExib-v0), where each sub-environment should be instantiated with a different configuration.

I am currently able to train PPO in a single environment, or in a vectorized one where every copy uses the same configuration:

import yaml

from ray.tune.registry import get_trainable_cls, register_env
from ray.rllib.core.rl_module.default_model_config import DefaultModelConfig

from mnist_exib import MNISTExib  # custom environment class (module name assumed)


def env_creator(env_config):
    return MNISTExib(**env_config)


# Register the custom env under the name used below
register_env('MNISTExib-v0', env_creator)

# Load a list of environment configurations
with open('env_configs.yaml') as f:
    env_configs = yaml.safe_load(f)

# Configure the PPO algorithm: a single local env runner with one env,
# using only the first configuration from the list
config = (
    get_trainable_cls('PPO')
    .get_default_config()
    .environment(
        'MNISTExib-v0',
        env_config=env_configs[0],
    )
    .env_runners(
        num_env_runners=0,
        num_envs_per_env_runner=1,
    )
    .rl_module(
        model_config=DefaultModelConfig(
            conv_activation="relu",
            head_fcnet_hiddens=[256],
            vf_share_layers=True,
            conv_filters=[(16, 4, 2), (32, 4, 2)],
        )
    )
)

# Build the PPO agent
agent = config.build_algo()

# Run one training iteration
train_res = agent.train()

Ideally, I would like to set num_envs_per_env_runner=8 and pass the list env_configs of size 8, so that training runs in parallel on 8 MNISTExib-v0 environments, each instantiated with a different configuration.

Is this possible somehow? Or is there a workaround that does not require changes to the MNISTExib class?

Thank you!

Hi @Leonardo_Lamanna,

Check out EnvDependingOnWorkerAndVectorIndex in
this section of the documentation.
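Roughly, the idea in that example is that RLlib calls your env_creator with an EnvContext, a dict subclass that also carries worker_index and vector_index identifying the particular env copy, so the creator can pick a different configuration per sub-environment. A minimal sketch of that pattern, assuming the list of configurations is passed through env_config under an "all_configs" key (that key name and the selection logic are my assumptions, not from the docs example):

from ray.rllib.env.env_context import EnvContext


def env_creator(env_config: EnvContext):
    # EnvContext behaves like the env_config dict, but additionally
    # exposes worker_index and vector_index for this env copy.
    all_configs = env_config["all_configs"]
    # Pick the configuration matching this copy's slot in the vector.
    cfg = all_configs[env_config.vector_index % len(all_configs)]
    return MNISTExib(**cfg)


register_env('MNISTExib-v0', env_creator)

config = (
    get_trainable_cls('PPO')
    .get_default_config()
    .environment('MNISTExib-v0', env_config={"all_configs": env_configs})
    .env_runners(num_env_runners=0, num_envs_per_env_runner=8)
)

With num_env_runners=0 all eight copies live on the local runner, so vector_index alone distinguishes them; with multiple runners you would combine worker_index and vector_index.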

Thank you @mannyv! I would like to train PPO on a training set of environments by splitting them into batches (roughly like the sketch below); I will check it out.
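For what it's worth, here is a hedged sketch of that batching idea, building on the "all_configs" assumption above (batch_size and num_iters_per_batch are illustrative names; carrying weights across batches would additionally require checkpointing and restoring the agent, which is omitted here):

batch_size = 8
num_iters_per_batch = 10  # illustrative value

for start in range(0, len(env_configs), batch_size):
    batch = env_configs[start:start + batch_size]
    config = (
        get_trainable_cls('PPO')
        .get_default_config()
        .environment('MNISTExib-v0', env_config={"all_configs": batch})
        .env_runners(num_env_runners=0, num_envs_per_env_runner=len(batch))
    )
    agent = config.build_algo()
    # NOTE: to keep learning across batches, restore the previous
    # agent's checkpoint here before training (omitted for brevity).
    for _ in range(num_iters_per_batch):
        agent.train()
    agent.stop()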