[RLlib] varying the number of agents in multi-agent environments

We are interested in varying the number of agents in multi-agent environments over time. Is there a way to do this without stopping the trainer and the policy client?

For example, we register the environment with the trainer like this:
register_env("multi_agent_cartpole", lambda _: MultiAgentCartPole({"num_agents": 4}))

Here, when starting the trainer, I specified 4 agents. Let's say that, after a few training steps, I want to create a new agent and delete an existing one. How can we do this dynamically without stopping the trainer and the workers?

Thanks
Sai

Basically you want to do curriculum learning with Ray. I guess the simplest way is to change the environment after some number of training iterations, e.g. from an on_train_result callback:

import ray
from ray import tune

def on_train_result(info):
    # Called after each training iteration; pick a task (curriculum
    # stage) based on the mean episode reward so far.
    result = info["result"]
    if result["episode_reward_mean"] > 200:
        task = 2
    elif result["episode_reward_mean"] > 100:
        task = 1
    else:
        task = 0
    # Push the new task to every env copy on every rollout worker.
    trainer = info["trainer"]
    trainer.workers.foreach_worker(
        lambda ev: ev.foreach_env(
            lambda env: env.set_task(task)))

ray.init()
tune.run(
    "PPO",
    config={
        "env": YourEnv,  # your env class; must implement set_task()
        "callbacks": {
            "on_train_result": on_train_result,
        },
    },
)

Otherwise you could use an environment wrapper to update the task; see the sketch below. But this depends on the effort you want to put into your project.
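A rough sketch of that idea, assuming a Gym-style benchmark env; the TaskWrapper name and the start_task argument are made up, and how set_task actually reconfigures the env is left open:

import gym
from ray.tune.registry import register_env

class TaskWrapper(gym.Wrapper):
    # Grafts set_task/get_task onto an env that doesn't provide them.
    def __init__(self, env, start_task=0):
        super().__init__(env)
        self.task = start_task

    def set_task(self, task):
        # A real wrapper would re-parameterize the wrapped env here.
        self.task = task

    def get_task(self):
        return self.task

register_env("wrapped_cartpole",
             lambda _: TaskWrapper(gym.make("CartPole-v0")))

The foreach_env callback from above would then hit the wrapper's set_task instead of the raw env's.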

Hey @Sertingolix, there is also a simple curriculum API, explained in this example script, that allows you to change your env in-flight (set it to a new “task”):

ray.rllib.examples.curriculum_learning.py

Your env will have to implement set_task and get_task, and you need to specify an env_task_fn in your config (it takes the train results and the env object, so you can check whether you should set the env's task to a new value).
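A minimal sketch of how the pieces fit together, assuming the TaskSettableEnv interface used by that example script; the env class, its dummy dynamics, and the reward threshold are all made up:

import gym
from ray import tune
from ray.rllib.env.apis.task_settable_env import TaskSettableEnv, TaskType

class MyTaskEnv(TaskSettableEnv):
    # Hypothetical single-agent env whose difficulty is an int "task".
    def __init__(self, config):
        self.task = config.get("start_task", 0)
        self.observation_space = gym.spaces.Discrete(2)
        self.action_space = gym.spaces.Discrete(2)

    def reset(self):
        return 0

    def step(self, action):
        # Dummy dynamics; a real env would depend on self.task.
        return 0, 1.0, True, {}

    def get_task(self) -> TaskType:
        return self.task

    def set_task(self, task: TaskType) -> None:
        self.task = task  # reconfigure the env for the new task here

def env_task_fn(train_results, task_settable_env, env_ctx) -> TaskType:
    # Advance to the next task once mean reward passes a threshold.
    cur = task_settable_env.get_task()
    return cur + 1 if train_results["episode_reward_mean"] > 200 else cur

tune.run(
    "PPO",
    config={
        "env": MyTaskEnv,
        "env_task_fn": env_task_fn,
    },
)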

That's right. Personally, I use the curriculum API too. For custom envs this works perfectly. For benchmark envs, I think one should wrap them, for comparison reasons.