How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hi everyone!
I am trying to implement a multi-agent environment in a server-client configuration, and I don't understand how to send the observations and rewards for the different agents.
In my case I have five agents, which I configured on the server side as follows:
from gym import spaces
from ray.rllib.policy.policy import PolicySpec

config = {
    # === Settings for Multi-Agent Environments ===
    "multiagent": {
        # Map of type MultiAgentPolicyConfigDict from policy ids to tuples
        # of (policy_cls, obs_space, act_space, config). This defines the
        # observation and action spaces of the policies and any extra config.
        "policies": {
            "agent_1": PolicySpec(
                policy_class=None,  # infer automatically from Algorithm
                observation_space=spaces.Box(float("-inf"), float("inf"), (9,)),
                action_space=spaces.Discrete(10),  # 5 binary actuators; their combinations give 2^5
                config={"gamma": 0.99, "lr": 0.001},
            ),
            "agent_2": PolicySpec(
                policy_class=None,  # infer automatically from Algorithm
                observation_space=spaces.Box(float("-inf"), float("inf"), (9,)),
                action_space=spaces.Discrete(10),  # 5 binary actuators; their combinations give 2^5
                config={"gamma": 0.99, "lr": 0.001},
            ),
            "agent_3": PolicySpec(
                policy_class=None,  # infer automatically from Algorithm
                observation_space=spaces.Box(float("-inf"), float("inf"), (7,)),
                action_space=spaces.Discrete(2),  # 5 binary actuators; their combinations give 2^5
                config={"gamma": 0.99, "lr": 0.001},
            ),
            "agent_4": PolicySpec(
                policy_class=None,  # infer automatically from Algorithm
                observation_space=spaces.Box(float("-inf"), float("inf"), (7,)),
                action_space=spaces.Discrete(2),  # 5 binary actuators; their combinations give 2^5
                config={"gamma": 0.99, "lr": 0.001},
            ),
            "agent_5": PolicySpec(
                policy_class=None,  # infer automatically from Algorithm
                observation_space=spaces.Box(float("-inf"), float("inf"), (4,)),
                action_space=spaces.Discrete(2),  # 5 binary actuators; their combinations give 2^5
                config={"gamma": 0.99, "lr": 0.001},
            ),
        },
        "policy_mapping_fn": None,
    },
}
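For context, this config is plugged into the server roughly the way the external-env examples do it. Below is a trimmed sketch of what I mean; PPO is just a stand-in for the algorithm, and SERVER_ADDRESS / SERVER_PORT are placeholders:

from ray.rllib.algorithms.ppo import PPO
from ray.rllib.env.policy_server_input import PolicyServerInput

SERVER_ADDRESS = "localhost"  # placeholder
SERVER_PORT = 9900            # placeholder

def _input(ioctx):
    # Read experiences from the external client instead of a local simulator.
    return PolicyServerInput(ioctx, SERVER_ADDRESS, SERVER_PORT)

config.update({
    "env": None,      # no simulator on the server side
    "input": _input,  # experiences come in over HTTP from the PolicyClient
    "num_workers": 0,
})

algo = PPO(config=config)
while True:
    print(algo.train())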
On the client side I need to send the observations to the server to request actions, and then log the rewards and the new observations. How can I do that?
I tried the following code to send the observations and rewards:
# Get the observations of the external environment for each agent
observations = {
    "agent_1": HVAC_H_obs,
    "agent_2": HVAC_C_obs,
    "agent_3": NW_obs,
    "agent_4": SW_obs,
    "agent_5": NWB_obs,
}
# Ask the server for actions
action = client.get_action(eid, observations)
# After the actions are applied in the environment, rewards are calculated
# from the new observations
rewards = {
    "agent_1": HVAC_H_rew,
    "agent_2": HVAC_C_rew,
    "agent_3": NW_rew,
    "agent_4": SW_rew,
    "agent_5": NWB_rew,
}
# The rewards are logged on the server so it can learn from them
client.log_returns(eid, rewards, {}, {})
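For completeness, the surrounding client loop looks roughly like the sketch below. The observation and reward dicts are built exactly as above; the address, the episode length, and the step where I apply the actions in my external simulator are placeholders:

from ray.rllib.env.policy_client import PolicyClient

# Address/port are placeholders; they must match what the server listens on.
client = PolicyClient("http://localhost:9900", inference_mode="remote")

eid = client.start_episode(training_enabled=True)

for step in range(100):  # placeholder episode length
    # observations built exactly as above, one entry per agent id
    actions = client.get_action(eid, observations)
    # ... apply the per-agent actions in the external environment here ...
    # rewards built exactly as above, one entry per agent id
    client.log_returns(eid, rewards, {}, {})

client.end_episode(eid, observations)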
And I get the following error:
-- Raw obs from env: { '1': { 'agent_1': [1, 0, 0.0, 0.0, 23.766666666666666, 20.72565539572239, 0.2833333333333333, 178.66666666666666, 53.515461867926604],
                              'agent_2': [1, 0, 0.0, 0.0, 23.766666666666666, 20.72565539572239, 0.2833333333333333, 178.66666666666666, 53.515461867926604],
                              'agent_3': [1, 0, 23.766666666666666, 20.72565539572239, 0.2833333333333333, 178.66666666666666, 53.515461867926604],
                              'agent_5': [1, 0, 23.766666666666666, 20.72565539572239],
                              'agent_4': [1, 0, 23.766666666666666, 20.72565539572239, 0.2833333333333333, 178.66666666666666, 53.515461867926604]}}
2022-07-26 20:10:49,246 INFO sampler.py:665 -- Info return from env: { '1': { 'agent_1': {}, 'agent_2': {}, 'agent_3': {}, 'agent_5': {}, 'agent_4': {}}}
--- Logging error ---
Traceback (most recent call last):
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\env\policy_client.py", line 303, in run
samples = self.rollout_worker.sample()
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 825, in sample
batches = [self.input_reader.next()]
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\evaluation\sampler.py", line 115, in next
batches = [self.get_data()]
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\evaluation\sampler.py", line 288, in get_data
item = next(self._env_runner)
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\evaluation\sampler.py", line 671, in _env_runner
active_envs, to_eval, outputs = _process_observations(
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\evaluation\sampler.py", line 893, in _process_observations
policy_id: PolicyID = episode.policy_for(agent_id)
File "C:\Users\grhen\AppData\Local\Programs\Python\Python39\lib\site-packages\ray\rllib\evaluation\episode.py", line 175, in policy_for
raise KeyError(
KeyError: "policy_mapping_fn returned invalid policy id 'default_policy'!"
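If I read the error correctly, the sampler falls back to "default_policy" because my policy_mapping_fn is None. My guess is that it has to be a callable that maps every agent id to one of the policy ids defined above, something like the sketch below, but I'm not sure about the exact signature on my Ray version or whether this is the right approach with the external-env client:

# Guess: map each agent id straight to the policy of the same name,
# since my policy ids are identical to my agent ids.
def policy_mapping_fn(agent_id, episode, worker, **kwargs):
    return agent_id  # e.g. "agent_1" -> policy "agent_1"

config["multiagent"]["policy_mapping_fn"] = policy_mapping_fn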
Still, I cannot figure out how multi-agent environments work in RLlib. Can anyone help me? Thanks!