GroupAgentsWrapper is interfering with metric tracking

aadharna · July 24, 2022, 10:12pm

So, I’m using QMIX, and part of setting up QMIX is using the grouped environment created by MultiAgentEnv.with_agent_groups(groups, obs_space, act_space), where obs_space and act_space are Tuple spaces.

When adapting from the two_step_game.py example file, we set up the group structure as group_id = {'agents': ['1', '2', '3', '4']} and then set self._agent_ids = {"agents"}.

This means that in the policy_mapping_fn, the agent_id that gets passed in is the group ID "agents" rather than an individual ID such as 1, 2, 3, ….

    def policy_mapping_fn(agent_id, episode, **kwargs):
        if agent_id == 0:
            return '0'
        elif agent_id == 1:
            return '1'
        elif agent_id == 2:
            return '2'
        else:
            return '3'

This mapping usually works when the agents are not grouped, and all the data I want for each agent is tracked properly.

I am currently using a custom model, so in the config, I need to set it up as:

    config["multiagent"] = {
        'policies': {
            "0": (None,
                  env.observation_space,
                  env.action_space,
                  {'model':{'custom_model': 'SimpleConv',
                   'custom_model_config': {}}
                   }),
            "1": (None,
                  env.observation_space,
                  env.action_space,
                  {'model':{'custom_model': 'SimpleConv',
                   'custom_model_config': {}}
                   }),
            "2": (None,
                  env.observation_space,
                  env.action_space,
                  {'model':{'custom_model': 'SimpleConv',
                   'custom_model_config': {}}
                   }),
            "3": (None,
                  env.observation_space,
                  env.action_space,
                  {'model':{'custom_model': 'SimpleConv',
                   'custom_model_config': {}}
                   }),
        },
        "policy_mapping_fn": policy_mapping_fn,
    }

However, with the grouped environment, the mapping function now receives the group ID "agents" instead of the individual agent IDs, so the metrics are not being tracked for all of the agents.
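To make the symptom concrete, here is a minimal sketch that replays the mapping function from above on both kinds of IDs. It is plain Python with no RLlib import, and passing None for episode is a placeholder assumption, since the function never reads that argument.

```python
def policy_mapping_fn(agent_id, episode, **kwargs):
    # Same logic as in the post: integer agent IDs 0-2 map to '0'-'2',
    # and everything else falls through to '3'.
    if agent_id == 0:
        return '0'
    elif agent_id == 1:
        return '1'
    elif agent_id == 2:
        return '2'
    else:
        return '3'


# Ungrouped case: each integer agent ID gets its own policy.
ungrouped = {aid: policy_mapping_fn(aid, None) for aid in [0, 1, 2, 3]}

# Grouped case: the only ID the function sees is the group ID "agents",
# which matches none of the integer checks and lands in the else branch.
grouped = {aid: policy_mapping_fn(aid, None) for aid in ["agents"]}

print(ungrouped)  # {0: '0', 1: '1', 2: '2', 3: '3'}
print(grouped)    # {'agents': '3'}
```

In this sketch the whole group ends up on policy '3', and policies '0' through '2' never receive an agent.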

What is the appropriate way to set up the policy mapping function when using the grouped environment, as is required for QMIX?

I am currently trying to break the groupings apart with something like:

    g1 = {f'group{k - 1}': [k - 1] for k in env.get_agent_ids()}

but that results in the following error:

...
File "C:\Users\Roque\AppData\Roaming\Python\Python38\site-packages\ray\rllib\execution\rollout_ops.py", line 99, in synchronous_parallel_sample
    sample_batches = ray.get(
  File "C:\Users\Roque\AppData\Roaming\Python\Python38\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Roque\AppData\Roaming\Python\Python38\site-packages\ray\worker.py", line 1831, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.sample() (pid=39368, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x000002111056DAC0>)
ValueError: The two structures don't have the same nested structure.
...
Entire first structure:
[.]
Entire second structure:
(., ., ., .)
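One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: each single-agent group yields a one-element list of observations, while the Tuple spaces passed to the wrapper still describe four agents. The sketch below imitates that in plain Python. Here group_obs is a hypothetical helper, not an RLlib function, and the agent IDs 1 to 4 and the 4-slot tuple are stand-ins for env.get_agent_ids() and the Tuple observation space.

```python
# Hypothetical stand-ins: agent IDs as env.get_agent_ids() might return them,
# and a 4-slot tuple playing the role of the Tuple observation space.
agent_ids = [1, 2, 3, 4]
tuple_space_slots = ("obs", "obs", "obs", "obs")

# The per-agent grouping from the post: one group per agent.
groups = {f"group{k - 1}": [k - 1] for k in agent_ids}


def group_obs(per_agent_obs, groups):
    """Hypothetical helper: collect each group's member observations into a list."""
    return {
        gid: [per_agent_obs[a] for a in members]
        for gid, members in groups.items()
    }


per_agent_obs = {k - 1: "obs" for k in agent_ids}
grouped = group_obs(per_agent_obs, groups)

for gid, obs_list in grouped.items():
    # Each group carries one observation, but the space describes four.
    print(gid, len(obs_list), "vs", len(tuple_space_slots))
```

If that reading is right, the [.] in the error corresponds to the one-element list per group and the (., ., ., .) to the four-element Tuple space.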

Any advice on how to use QMIX with custom models and how to properly set up the policy mapping function would be greatly appreciated.
