We are interested in varying the number of agents in a multi-agent environment over time. Is there a way to do this without stopping the trainer and the policy client?
For example, we are registering the environment in the trainer like this: register_env("multi_agent_cartpole", lambda _: MultiAgentCartPole({"num_agents": 4}))
Here, I specified 4 agents when the trainer starts. Suppose that, after a few training steps, I want to create a new agent and delete an existing one. How can we do this dynamically without stopping the trainer and worker?
Hey @Sertingolix, there is also a simple curriculum API, explained in this example script, that allows you to change your env in-flight (set it to a new “task”):
ray.rllib.examples.curriculum_learning.py
Your env will have to implement set_task and get_task, and you need to specify an env_task_fn in your config. This function takes the train results and the env object, so you can check whether the env’s task should be set to a new value.
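To make the idea concrete, here is a rough sketch in plain Python (no Ray import, so it runs on its own). It treats the number of agents as the “task”. The class name VariableAgentsEnv, the reward threshold of 150, the cap of 8 agents, and the optional env_ctx argument are all assumptions made up for illustration; in a real setup the env would derive from RLlib's own env base classes, and the exact env_task_fn signature should be checked against the example script.

```python
# Illustrative sketch only: plain-Python stand-ins, not RLlib classes.
class VariableAgentsEnv:
    """Toy multi-agent env whose "task" is the number of active agents."""

    def __init__(self, config=None):
        config = config or {}
        self.num_agents = config.get("num_agents", 4)

    def get_task(self):
        # The current task is simply the current agent count.
        return self.num_agents

    def set_task(self, task):
        # Takes effect on the next reset(), when per-agent state is rebuilt.
        self.num_agents = int(task)

    def reset(self):
        # One dummy observation per active agent, keyed by agent id.
        return {agent_id: 0.0 for agent_id in range(self.num_agents)}


def env_task_fn(train_results, env, env_ctx=None):
    """Return the task the env should use next, based on training results."""
    reward = train_results.get("episode_reward_mean", 0.0)
    current = env.get_task()
    # Hypothetical rule: add one agent once the mean reward passes 150,
    # up to a maximum of 8 agents.
    if reward > 150.0 and current < 8:
        return current + 1
    return current


# Simulate what the trainer would do after a training iteration.
env = VariableAgentsEnv({"num_agents": 4})
env.set_task(env_task_fn({"episode_reward_mean": 200.0}, env))
obs = env.reset()
print(sorted(obs))  # agent ids present after the task change
```

Note that this only changes how many agents the env emits; which policy each agent id maps to is still decided by your policy mapping function.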
That's right. Personally, I use the curriculum API too. For custom envs this works perfectly. For benchmark envs, I think one should wrap them so that results remain comparable.