Hi,
I’ve been using PettingZoo for my MARL research, and I’m evaluating whether RLlib can make things easier for me.
I’m working on a MARL experiment where each agent has its own separate neural network, i.e. a separate set of weights. In other words, the agents learn by playing together, but each of them learns separately and may develop behavior that differs from that of the other agents.
Is that something that’s easily possible? Is there an example online that does this that I could take a look at?
Thanks for your help,
Ram Rachum.
I believe you just need to assign a different policy to each of the agents:
policies = {
    'policy_1': (None, obs_space_1, action_space_1, {}),
    'policy_2': (None, obs_space_2, action_space_2, {}),
    ...
}
def policy_mapping_fn(agent_id):
    if agent_id == 'agent_1':
        return 'policy_1'
    elif agent_id == 'agent_2':
        return 'policy_2'
    ...
This will allow the different agents in your simulation to be controlled by different policies, and each agent’s rollout fragments will only be used to train its own policy. Here’s a full example using Abmarl.
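For reference, here is a minimal sketch of how those two pieces plug into an RLlib training run, assuming a PettingZoo environment and the Ray 1.x-era config API. The environment name "my_env", the helper make_env, and the agent/policy names are placeholders for your own setup, and exact import paths and the policy_mapping_fn signature vary across RLlib versions:

# Minimal independent-learners sketch: one policy (own network and weights)
# per agent. Assumes Ray 1.x APIs; adapt imports/signatures to your version.
import ray
from ray import tune
from ray.tune.registry import register_env
from ray.rllib.env import PettingZooEnv
from pettingzoo.mpe import simple_spread_v2

def make_env(env_config):
    # Wrap any PettingZoo AEC environment for RLlib.
    return PettingZooEnv(simple_spread_v2.env())

register_env("my_env", make_env)

# One entry per agent: each policy trains its own set of weights.
# simple_spread agents share spaces, so we read them from a temp env.
tmp_env = make_env({})
obs_space = tmp_env.observation_space
act_space = tmp_env.action_space
policies = {
    "policy_agent_0": (None, obs_space, act_space, {}),
    "policy_agent_1": (None, obs_space, act_space, {}),
    "policy_agent_2": (None, obs_space, act_space, {}),
}

def policy_mapping_fn(agent_id):
    # simple_spread agents are named "agent_0", "agent_1", "agent_2".
    return f"policy_{agent_id}"

ray.init()
tune.run(
    "PPO",
    stop={"training_iteration": 10},
    config={
        "env": "my_env",
        "multiagent": {
            "policies": policies,
            "policy_mapping_fn": policy_mapping_fn,
        },
    },
)

Because every agent maps to its own policy, the agents train completely independent networks while still generating experience in the same shared simulation.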
I am also a MARL researcher, and I have been using RLlib for the past three years; it has made my work significantly easier. It is designed to handle multi-agent simulations, so most of what you want to do is already supported. Please feel free to reach out directly if you have any MARL questions.
Thank you so much Edward! I’ll check it out.