PolicyClient and QMix + MultiAgentEnv?

ravlur · August 17, 2023, 4:15am

How severely does this issue affect your experience of using Ray?

High: It blocks me from completing my task.

I have an RL experiment that uses the QMix algorithm on a MultiAgentEnv. However, it is pretty slow due to some compute-intensive sub-processes so I’m looking to use the Client-Server architecture described here (Environments — Ray 3.0.0.dev0) to improve episodes/min.

Building on the simple Cartpole scripts example of client-server architecture, I’ve extended it to work with a toy MultiAgentEnv environment.

Here are the client-side parameters:

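    # Assumes a Gymnasium-based Ray release; older releases use `gym.spaces`.
    from gymnasium.spaces import Box, Discrete, Tuple

    # One group containing the single agent with ID 0.
    grouping = {
        "group_1": [0],
    }
    # Grouped spaces are Tuples with one entry per agent in the group.
    obs_space = Tuple([Box(float("-inf"), float("inf"), (4,))])
    act_space = Tuple([Discrete(2)])
    env = TestEnv().with_agent_groups(grouping, obs_space=obs_space, act_space=act_space)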

So, it’s a simple single-agent, single-group MultiAgentEnv.
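To make the shape of the grouped data concrete, here is a minimal sketch of what grouping does to a per-agent observation dict. `group_observations` is a hypothetical helper written for illustration only; it is not RLlib's implementation of `with_agent_groups`, and plain lists stand in for the NumPy observation arrays.

```python
def group_observations(per_agent_obs, grouping):
    """Collect per-agent observations into one list per group (illustration only)."""
    return {
        group_id: [per_agent_obs[agent_id] for agent_id in agent_ids]
        for group_id, agent_ids in grouping.items()
    }


# Same grouping as in the post: one group holding the single agent 0.
grouping = {"group_1": [0]}
# Placeholder 4-dimensional observation for agent 0.
per_agent_obs = {0: [0.0, 0.0, 0.0, 0.0]}

grouped = group_observations(per_agent_obs, grouping)
print(grouped)
```

The result is a dict keyed by group ID whose value is a list with one observation per agent, which is the same nesting that appears as the first structure in the error message.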

The server-side parameters are the following:

    obs_space = Tuple([Box(float("-inf"), float("inf"), (4,))])
    act_space = Tuple([Discrete(2)])

    config = (
        ...
        .environment(
            env=None,
            observation_space=obs_space,
            action_space=act_space,
        )
        ...
    )

When I run this, I get the following error:

ValueError: The two structures don't have the same nested structure.

First structure: type=tuple str=({'group_1': [array([ 0.03581125, -0.00619089,  0.04431266,  0.03354166], dtype=float32)]},)

Second structure: type=tuple str=(array([ 0.8475069 ,  0.6639858 , -0.13116916,  1.0029598 ], dtype=float32),)

More specifically: Substructure "type=dict str={'group_1': [array([ 0.03581125, -0.00619089,  0.04431266,  0.03354166], dtype=float32)]}" is a sequence,
while substructure "type=ndarray str=[ 0.8475069   0.6639858  -0.13116916  1.0029598 ]" is not
Entire first structure:
({'group_1': [.]},)
Entire second structure:
(.,)
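The mismatch can be reproduced without RLlib. The sketch below is only an illustration of how to read the error: `describe` is a hypothetical helper (not part of Ray or of the library that raised the error), and `array.array` stands in for the NumPy observation vectors. It reduces each structure to its nesting skeleton so the difference is visible at a glance.

```python
from array import array


def describe(structure):
    """Return the nesting skeleton of a structure, with every leaf shown as '.'."""
    if isinstance(structure, dict):
        return {key: describe(value) for key, value in structure.items()}
    if isinstance(structure, (list, tuple)):
        # Rebuild the same container type around the described items.
        return type(structure)(describe(item) for item in structure)
    return "."


# First structure from the error: a tuple wrapping a dict of grouped observations.
first = ({"group_1": [array("f", [0.0358, -0.0062, 0.0443, 0.0335])]},)
# Second structure from the error: a tuple wrapping a bare observation vector.
second = (array("f", [0.8475, 0.6640, -0.1312, 1.0030]),)

print(describe(first))
print(describe(second))
```

The first skeleton has an extra dict-and-list layer (the group wrapper) where the second has a bare leaf, so a check that requires both to have identical nesting fails.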

I know there is an ExternalMultiAgentEnv but I’d like to use PolicyClient at the moment. Does PolicyClient not work with MultiAgentEnv?

I’d appreciate any help. Thanks in advance.

Cheers!

ravlur · August 17, 2023, 5:16am

I figured it out. I’ll put the solution here for future reference.

Your env class must inherit from both ExternalMultiAgentEnv and MultiAgentEnv.
You also need to add the following to your QMixConfig:

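    multiagent_config = {
        "policies": {
            # (policy class, observation space, action space, config overrides);
            # None for the class selects the algorithm's default policy.
            "main": (None, obs_space, act_space, {}),
        },
        # Map every agent to the shared "main" policy; any extra arguments
        # the caller passes (e.g. episode, worker) are ignored.
        "policy_mapping_fn": lambda agent_id, *args, **kwargs: "main",
    }

    config = (
        ...
        .multi_agent(**multiagent_config)
        ...
    )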

Here the lambda function controls which policy each agent uses. For a single shared policy, the above works.
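As a rough sketch of the difference between a shared policy and one policy per agent, the two mapping functions below can be compared side by side. Both names are hypothetical, as are the `policy_<id>` policy IDs; any ID a mapping function returns would also have to appear as a key in "policies", and whether a given algorithm supports more than one policy is a separate question.

```python
def shared_policy_mapping(agent_id, *args, **kwargs):
    """Send every agent (or agent group) to the single shared policy."""
    return "main"


def per_agent_policy_mapping(agent_id, *args, **kwargs):
    """Hypothetical alternative: derive a distinct policy ID from the agent ID."""
    return f"policy_{agent_id}"


# With grouped agents, the ID passed in is the group ID.
for agent_id in ["group_1", "group_2"]:
    print(agent_id, shared_policy_mapping(agent_id), per_agent_policy_mapping(agent_id))
```

Accepting `*args` and `**kwargs` keeps the functions usable whether the caller passes only the agent ID or additional context arguments.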

Hope this helps someone!
