How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I have an RL experiment that uses the QMix algorithm on a MultiAgentEnv. However, it is quite slow because of some compute-intensive sub-processes, so I’d like to use the client-server architecture described here (Environments — Ray 3.0.0.dev0) to improve episodes/min.
I’ve extended the simple CartPole client-server example scripts to work with a toy MultiAgentEnv.
Here are the client-side parameters:
grouping = {
    "group_1": [0],
}
obs_space = Tuple(
    [
        Box(float("-inf"), float("inf"), (4,))
    ]
)
act_space = Tuple(
    [
        Discrete(2)
    ]
)
env = TestEnv().with_agent_groups(grouping, obs_space=obs_space, act_space=act_space)
So, it’s a simple single-agent, single-group MultiAgentEnv.
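To make the data layout concrete, here is a minimal sketch of what I understand the grouping to do to per-agent observations. It does not call RLlib: `group_observations` is a hypothetical helper written only for illustration, and the grouped layout is inferred from the error message shown further down, not from the library source.

```python
from typing import Any, Dict, List


def group_observations(
    per_agent_obs: Dict[Any, Any], grouping: Dict[str, List[Any]]
) -> Dict[str, List[Any]]:
    """Hypothetical helper: collect per-agent observations into one list per group."""
    grouped = {}
    for group_id, agent_ids in grouping.items():
        # Keep the agent order given in the grouping definition.
        grouped[group_id] = [per_agent_obs[agent_id] for agent_id in agent_ids]
    return grouped


grouping = {"group_1": [0]}
# Plain lists stand in for the 4-dimensional Box observations.
per_agent_obs = {0: [0.03, -0.01, 0.04, 0.03]}

grouped_obs = group_observations(per_agent_obs, grouping)
print(grouped_obs)
```

With one agent in one group, the grouped observation is a dict keyed by the group ID whose value is a one-element list, which is the same layout that appears in the first structure of the error.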
The server-side parameters are the following:
obs_space = Tuple(
    [
        Box(float("-inf"), float("inf"), (4,))
    ]
)
act_space = Tuple(
    [
        Discrete(2)
    ]
)
config = (
    ...
    .environment(
        env=None,
        observation_space=obs_space,
        action_space=act_space,
    )
    ...
)
When I run this, I get the following error:
ValueError: The two structures don't have the same nested structure.
First structure: type=tuple str=({'group_1': [array([ 0.03581125, -0.00619089, 0.04431266, 0.03354166], dtype=float32)]},)
Second structure: type=tuple str=(array([ 0.8475069 , 0.6639858 , -0.13116916, 1.0029598 ], dtype=float32),)
More specifically: Substructure "type=dict str={'group_1': [array([ 0.03581125, -0.00619089, 0.04431266, 0.03354166], dtype=float32)]}" is a sequence,
while substructure "type=ndarray str=[ 0.8475069 0.6639858 -0.13116916 1.0029598 ]" is not
Entire first structure:
({'group_1': [.]},)
Entire second structure:
(.,)
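To show how I read this error, here is a small sketch that reproduces the two skeletons without Ray. It is only an illustration: `skeleton` and `same_structure` are hypothetical helpers, not the library's implementation, and `array.array` from the standard library stands in for the NumPy arrays in the traceback.

```python
from array import array


def skeleton(structure):
    """Replace every leaf with "." while keeping dict/list/tuple nesting."""
    if isinstance(structure, dict):
        return {key: skeleton(value) for key, value in structure.items()}
    if isinstance(structure, (list, tuple)):
        return type(structure)(skeleton(value) for value in structure)
    return "."  # anything else (e.g. an observation vector) is a leaf


def same_structure(a, b):
    """True if both values have identical nesting, ignoring leaf contents."""
    return skeleton(a) == skeleton(b)


obs = array("f", [0.0358, -0.0062, 0.0443, 0.0335])

# First structure from the error: a tuple wrapping the grouped dict.
first = ({"group_1": [obs]},)
# Second structure from the error: a tuple wrapping a bare observation.
second = (array("f", [0.8475, 0.6640, -0.1312, 1.0030]),)

print(skeleton(first))
print(skeleton(second))
print(same_structure(first, second))
```

In this sketch the two values only match when the first tuple holds the bare observation instead of the group dict, which is why I suspect the grouped observation is being compared against a sample from the ungrouped Tuple space.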
I know there is an ExternalMultiAgentEnv, but I’d like to use PolicyClient for now. Does PolicyClient not work with MultiAgentEnv?
I’d appreciate any help. Thanks in advance.
Cheers!