Get agent ID in multi-agent setting

lucas_spangher · September 15, 2021, 5:19pm

Sorry, this is probably a very basic question. I’m not sure where to retrieve the agent_ids for the multiple agents created in a multiagent setting, so that I can map policies onto them. Can anyone please point me there?

lucas_spangher · September 15, 2021, 6:30pm

Or set the agent_id, if that is easier

mannyv · September 15, 2021, 11:39pm

Hi @lucas_spangher

The agent_ids are provided by the environment. They are often strings but they need not be. They can be any hashable type. These ids will be the keys in the dictionaries the environment returns from calls to reset and step.

You need to figure out what those ids are, then write a function that, given an agent id, returns the key of the appropriate policy that you specified in your multiagent config for RLlib.
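A minimal sketch of such a function, assuming an environment whose agent ids are "agent_0" and "agent_1" and a config whose policies are keyed "policy_a" and "policy_b". All four names are made up for illustration; use whatever your environment and config actually define.

```python
# Hypothetical agent ids (keys) and policy keys (values).
AGENT_TO_POLICY = {
    "agent_0": "policy_a",
    "agent_1": "policy_b",
}


def policy_mapping_fn(agent_id, *args, **kwargs):
    """Return the policy key for the given agent id.

    Extra positional/keyword arguments that the caller may pass are ignored.
    """
    return AGENT_TO_POLICY[agent_id]


# In RLlib 1.x-style configs the function is passed under the "multiagent" key,
# next to a "policies" entry that defines "policy_a" and "policy_b" (omitted here).
multiagent_config = {"policy_mapping_fn": policy_mapping_fn}

print(policy_mapping_fn("agent_0"))
```

An unknown agent id raises a KeyError here, which is usually what you want while debugging a new environment.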

lucas_spangher · September 15, 2021, 11:53pm

Ah yes, this is what my question was trying to ask. How are the agent IDs set in environments?

I didn’t catch this in the examples in the rllib library. I can look through the examples again, as I was looking more in agent configs and such, but, if you have a ready example you can point me to, I’d love to see. Thank you very much!

mannyv · September 16, 2021, 12:06am

Are you making your own multiagent env or are you using a pre-made one?

lucas_spangher · September 16, 2021, 1:14am

I’m converting a single agent env to multiagent. So I guess I would need to set up the naming conventions?

The environment, CounterfactualMicrogridRLlib, is at the bottom of this file, if you’re curious to see:

https://github.com/Aphoh/temp_tc/blob/lucas_MACRL/gym-microgrid/gym_microgrid/envs/microgrid_env.py

mannyv · September 16, 2021, 1:24am

Yes, it is totally up to you. RLlib gives no intrinsic meaning to agent ids. They are arbitrary, but they do need to be unique for every agent in the environment. You could name them 0…n or red, green, blue,…, or agent_0, agent_1,…,agent_n.

If you are going to assign different policies to different types of agents, then strings are a good way to go because they make the policy_mapping_function easy to write. This would be like car_0, car_1, truck_0, bike_0, bike_1, bike_2,…
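With ids of the form type_index, the mapping can be derived from the id itself. A small sketch, assuming the policies in the config are keyed "car_policy", "truck_policy", and "bike_policy" (those keys are an assumption for illustration, not something RLlib defines):

```python
def policy_mapping_fn(agent_id, *args, **kwargs):
    """Map ids like "car_0" or "bike_2" to a per-type policy key."""
    # Split off the trailing index: "car_0" -> "car", "bike_2" -> "bike".
    agent_type = agent_id.rsplit("_", 1)[0]
    return f"{agent_type}_policy"


for agent_id in ["car_0", "car_1", "truck_0", "bike_2"]:
    print(agent_id, "->", policy_mapping_fn(agent_id))
```

Splitting from the right means a type name that itself contains an underscore still works, as long as the id ends in _index.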

lucas_spangher · September 16, 2021, 1:47am

But, sorry, this is probably sounding completely dumb, but how and where in a multiagent env are individual agents processed and assigned names? I don’t see any function in any multiagent init that is aware of individual agents, just policies.

mannyv · September 16, 2021, 2:01am

You make them up in the environment you are writing.

This example holds a list of agents. In reset and step it gives each of them an int agent id from 0 to n-1 (its index in the list). In reset, i is the agent id and a.reset() provides that agent’s observation.

github.com

ray-project/ray/blob/ed04ab71401874db8b2f4f91ca5737919259b950/rllib/examples/env/multi_agent.py#L31

    
      
    def __init__(self, num):
        self.agents = [MockEnv(25) for _ in range(num)]
        self.dones = set()
        self.observation_space = gym.spaces.Discrete(2)
        self.action_space = gym.spaces.Discrete(2)
        self.resetted = False

    def reset(self):
        self.resetted = True
        self.dones = set()
        return {i: a.reset() for i, a in enumerate(self.agents)}

    def step(self, action_dict):
        obs, rew, done, info = {}, {}, {}, {}
        for i, action in action_dict.items():
            obs[i], rew[i], done[i], info[i] = self.agents[i].step(action)
            if done[i]:
                self.dones.add(i)
        done["__all__"] = len(self.dones) == len(self.agents)
        return obs, rew, done, info

Agent ids are not specified anywhere in the RLlib config. They come from the environment.

lucas_spangher · September 16, 2021, 2:48am

Thanks for bearing with me. I appreciate it.

lucas_spangher · September 17, 2021, 11:51pm

Hey Manny,

I should have been clearer. The example you posted is part of the source of my confusion. I don’t see anywhere that anything like an agent_id is set when agents are created. In the example, self.agents is an unnamed list of identical environments.

Am I correct in understanding, then, that this example deals with unnamed agents, that agent_id isn’t strictly necessary for a multi-agent env to work, and that you should only set it yourself if you need to assign policies based on it?

If this is the case, perhaps it would be helpful to have an environment similar to the MultiAgentTraffic environment that is on the tutorial of multiagent envs, because the code snippets from that made it seem like the agents required naming generally.

mannyv · September 18, 2021, 2:16am

@lucas_spangher

I agree that the example I showed is a bit out of the ordinary but it was the only one I could find easily in the examples.

The “agent_id” is the key used to access an agent’s value in the dictionaries returned by the environment’s reset or step.

Let’s say we have an environment with 3 agents and their observation space is a Discrete(1).

We call env.reset() and get the following result.
{0: [12], "a": [1], (1, 2): [9]}

First we should fire the developer who wrote that environment.

This environment currently has 3 agents. The agent_ids are 0, “a”, and (1,2).

Internally there is no “name” for the agent. The ID is just the dictionary key used to access information about an agent. Yes, agent_ids are required, because multiagent envs must return dictionaries and non-empty dictionaries must have keys.
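To make that concrete, here is a toy sketch in which the agent ids exist only as dictionary keys. It is a plain Python class so that it runs without RLlib installed; in RLlib it would subclass MultiAgentEnv. The class name, the ids "car_0" and "car_1", and the reward rule are all invented for illustration.

```python
class TwoCarEnv:
    """Toy stand-in for a multi-agent env; ids are just the dict keys."""

    def __init__(self, max_steps=3):
        # The environment author makes these up; nothing registers them elsewhere.
        self.agent_ids = ["car_0", "car_1"]
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        # One observation per agent, keyed by agent id.
        return {agent_id: 0 for agent_id in self.agent_ids}

    def step(self, action_dict):
        self.t += 1
        finished = self.t >= self.max_steps
        obs = {agent_id: self.t for agent_id in action_dict}
        rew = {agent_id: float(action) for agent_id, action in action_dict.items()}
        done = {agent_id: finished for agent_id in action_dict}
        done["__all__"] = finished  # special key marking the end of the episode
        info = {agent_id: {} for agent_id in action_dict}
        return obs, rew, done, info


env = TwoCarEnv()
first_obs = env.reset()
obs, rew, done, info = env.step({"car_0": 1, "car_1": 0})
print(sorted(first_obs), rew, done["__all__"])
```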

mannyv · September 18, 2021, 2:21am

@lucas_spangher

Here is another environment that may make more sense:

github.com

wsjeon/maddpg-rllib/blob/master/env/multiagent_particle_env.py

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from gym.spaces import Discrete, Box, MultiDiscrete
from ray import rllib
from make_env import make_env

import numpy as np
import time


class RLlibMultiAgentParticleEnv(rllib.MultiAgentEnv):
    """Wraps OpenAI Multi-Agent Particle env to be compatible with RLLib multi-agent."""

    def __init__(self, **mpe_args):
        """Create a new Multi-Agent Particle env compatible with RLlib.

        Arguments:
            mpe_args (dict): Arguments to pass to the underlying

This file has been truncated.

rusu24edward · September 22, 2021, 5:09am

I’ve created a multiagent framework that organizes how agents are stored in an environment. You may find it helpful: see the Design page of the Abmarl 0.1.3 documentation.

lucas_spangher · September 22, 2021, 8:03am

Hey Manny!!

Thanks for following up! I’ve been putting some time into this about 2-3 days a week, so apologies for the delayed responses.

So I’ve looked at it with my team… I realized that one area of oversight was the initial tutorial for multiagent envs (the “RLlib Environments” page of the Ray v1.6.0 docs), which in the first block has us calling the reset() function to get a dictionary with “car_1”, “car_2”, “traffic_light_1”. These are agents, of which only three are running that particular turn (even though the init in the tutorial initializes 20 cars and 5 traffic lights).

My confusion was that in the next code block, we are assigning policies for “car1”, “car2”, and “traffic_light”. These are different than the agents that we are retrieving above… they are policies! There are two car policies. So I thought that somehow the two were and had to be related… that we needed agent_ids when creating the policies.

NOW my understanding is that the action_dict passed to step() is keyed by the agent ids in the dictionary returned by reset() (or by the previous step()), with one action per agent. The policies are created at the beginning and will train over time.

Is that correct?

One further thing I noticed through accidentally leaving a print statement in my policy_mapping_fn() is that it is basically called each step. First I thought this was a bug, but now I understand that you may want to dynamically map the policies to each set of active agents each turn. Is that correct?

    def policy_mapping_fn(agent_id, **kwargs):
        print("agent_id---------------------")
        print(agent_id)
        pol_id = agent_id
        return pol_id

BTW – my environment will have an equal number of agents and policies, and maintain the mapping throughout the training. So I think I have it right now, but that was a bit difficult.

Also, BTW, @rusu24edward, thanks! I think I’m going to avoid overhauling my codebase, as it is currently based on RLlib, unless your framework can fit into it?

rusu24edward · September 22, 2021, 2:56pm

You are right, the agents with observations returned in one step will be the ones that provide actions in the next. I too have learned this from experience, and it would be nice if the tutorials made this explicitly clear for those who design environments from scratch.
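A hand-written loop can illustrate this. The sketch below is not RLlib’s sampler; it only mimics the contract described above, using a hypothetical turn-based environment in which "player_0" and "player_1" alternate. Only the agent that received an observation supplies an action on the next step.

```python
class TurnBasedEnv:
    """Hypothetical env in which two agents alternate turns."""

    def __init__(self, num_turns=4):
        self.num_turns = num_turns
        self.turn = 0

    def _current_agent(self):
        return "player_0" if self.turn % 2 == 0 else "player_1"

    def reset(self):
        self.turn = 0
        # Only the agent that must act next gets an observation.
        return {self._current_agent(): 0}

    def step(self, action_dict):
        self.turn += 1
        next_agent = self._current_agent()
        obs = {next_agent: self.turn}
        rew = {next_agent: 0.0}
        done = {"__all__": self.turn >= self.num_turns}
        return obs, rew, done, {}


env = TurnBasedEnv()
obs = env.reset()
done = {"__all__": False}
acting_order = []
while not done["__all__"]:
    # Build actions only for the agents present in the latest observation dict.
    action_dict = {agent_id: 0 for agent_id in obs}
    acting_order.append(sorted(action_dict))
    obs, rew, done, info = env.step(action_dict)

print(acting_order)
```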

The framework that I linked to integrates with RLlib. It provides a couple of neat features: workflow scripts, config files, and an environment interface that not only makes the agent organization explicit but also better separates the part of the simulation that updates the state from the part that returns information to agents. No overhaul needed. I designed it by going through the same pains you are, so you may be able to accelerate your development by using it. I’d be happy to provide more guidance if you want to use it.

rfali · October 5, 2021, 7:53am

@lucas_spangher
Have you had a chance to look at multi_agent_cartpole.py? I think the agent_ID is set there.

I was using PettingZoo and I also needed to access agent_ids, but the agent_IDs in that environment are strings, so I had to do a dict mapping like the following:

    def policy_mapping_fn(agent_id, **kwargs):
        agent_dict = {'first_0': 0, 'second_0': 1, 'third_0': 2, 'fourth_0': 3}

        if agent_dict[agent_id] % 2 == 0:
            return "dqn_policy"    # Even numbered agents 0,2,4...
        else:
            return "ppo_policy"    # Odd numbered agents 1,3,5...
