I'm confused about how policy mapping works in configuration

mannyv · July 28, 2022, 3:09pm

The names of the agents are defined in the environment you provide and are included as keys in the data provided by reset and step.
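
To make that concrete, here is a minimal sketch of a toy environment whose reset and step return dictionaries keyed by agent name. TwoAgentEnv and the agent names are made up for illustration, the class is a plain Python stand-in rather than an actual RLlib environment, and the four-value step return assumes the older gym-style API.

```python
class TwoAgentEnv:
    """Toy stand-in for a multi-agent environment (hypothetical, not an RLlib class)."""

    def __init__(self):
        self.agent_ids = ["car_0", "bike_0"]  # hypothetical agent names

    def reset(self):
        # One observation per agent, keyed by agent id.
        return {agent_id: 0.0 for agent_id in self.agent_ids}

    def step(self, action_dict):
        # Actions arrive keyed by agent id; results are keyed the same way.
        obs = {agent_id: 1.0 for agent_id in action_dict}
        rewards = {agent_id: 0.0 for agent_id in action_dict}
        dones = {agent_id: False for agent_id in action_dict}
        dones["__all__"] = False  # flag for "the whole episode is finished"
        infos = {agent_id: {} for agent_id in action_dict}
        return obs, rewards, dones, infos


env = TwoAgentEnv()
first_obs = env.reset()
print(sorted(first_obs))  # the agent names are simply the dictionary keys
```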

In RLlib, policies make the action decisions given observations from the environment. These policies are optimized with an RL algorithm during training.

In the RLlib config you need to define the policies that will make the action decisions. If you don’t specify any, a single policy called “default_policy” is created.

You also need to provide a policy mapping function that maps agent ids to policy ids. The exception is when you rely on the default_policy: then no mapping is needed, because every agent is mapped to that one policy.
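
As a rough sketch of how those two pieces sit together, the fragment below builds a dict-style config with a "multiagent" section. The policy names, agent names, and mapping rule are hypothetical, and the exact layout (a set of policy ids under "policies", with spaces inferred from the environment) is an assumption about the RLlib version in use, so check it against the docs for your release.

```python
def policy_mapping_fn(agent_id, *args, **kwargs):
    # Hypothetical rule: agent_0 uses policy_a, every other agent uses policy_b.
    # Extra positional/keyword arguments passed by the caller are ignored.
    return "policy_a" if agent_id == "agent_0" else "policy_b"


config = {
    "multiagent": {
        # Assumed form: policy ids only, leaving spaces to be inferred.
        "policies": {"policy_a", "policy_b"},
        "policy_mapping_fn": policy_mapping_fn,
    },
}

# Every id the mapping function returns must be one of the declared policies.
for agent_id in ["agent_0", "agent_1"]:
    assert config["multiagent"]["policy_mapping_fn"](agent_id) in config["multiagent"]["policies"]
```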

Now here is the part I think you are confused by. There is no formal specification of the agent ids during configuration. They are implicit information in the environment: you either need to know them ahead of time or write some methods in your environment to retrieve them. The member _agent_ids is an attempt to remedy that implicit knowledge, but it is an RLlib convention and most environments do not have it.
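
One way to retrieve the ids, sketched under assumptions: use _agent_ids when the environment has it, otherwise reset the environment once and read the keys of the observation dict. discover_agent_ids and StubEnv are hypothetical names, and the fallback only sees agents that are present at the first step, so it can miss agents that join later in an episode.

```python
def discover_agent_ids(env):
    """Best-effort lookup of agent ids (hypothetical helper, not an RLlib function)."""
    declared = getattr(env, "_agent_ids", None)
    if declared:
        return set(declared)
    result = env.reset()
    # Newer gym-style APIs return (obs, info); older ones return obs only.
    obs = result[0] if isinstance(result, tuple) else result
    return set(obs.keys())


class StubEnv:
    """Minimal stand-in environment without an _agent_ids member."""

    def reset(self):
        return {"car_0": 0.0, "bike_0": 0.0}


print(sorted(discover_agent_ids(StubEnv())))
```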

You do not necessarily need to know the exact agent names ahead of time if they are named according to some convention. For example, perhaps you have an environment with car agents (whose names are formatted like car_0, car_1, car_2, …) and bicycle agents (bike_0, bike_1, …), and you have two policies: one for cars (car_policy) and one for bicycles (bike_policy). You could write a policy mapping function like this:

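def agent_to_policy_map(agent_id, *args, **kwargs):
    # RLlib may pass extra arguments (such as the episode); they are ignored here.
    if agent_id.startswith("car"):
        return "car_policy"
    elif agent_id.startswith("bike"):
        return "bike_policy"
    else:
        raise ValueError(f"Unknown agent type: {agent_id}")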
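
A quick way to sanity-check a prefix-based mapping before training is to run it over a few sample names and group the agents by the policy they land on. This is only a sketch: the function is restated so the snippet runs on its own, and the sample agent names are made up.

```python
from collections import defaultdict


def agent_to_policy_map(agent_id, *args, **kwargs):
    # Prefix-based mapping; extra arguments from the caller are ignored.
    if agent_id.startswith("car"):
        return "car_policy"
    elif agent_id.startswith("bike"):
        return "bike_policy"
    else:
        raise ValueError(f"Unknown agent type: {agent_id}")


sample_agent_ids = ["car_0", "car_1", "bike_0"]  # hypothetical names

# Group agents by the policy id they are mapped to.
agents_by_policy = defaultdict(list)
for agent_id in sample_agent_ids:
    agents_by_policy[agent_to_policy_map(agent_id)].append(agent_id)

print(dict(agents_by_policy))
# {'car_policy': ['car_0', 'car_1'], 'bike_policy': ['bike_0']}
```

An unexpected name such as "truck_0" raises ValueError, which surfaces a naming mistake early instead of silently routing the agent to the wrong policy.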
