Different step space for different agents

Hi @sven1977 and @mannyv

Thanks for the reply, and thanks very much for the great work! After updating Ray to the latest version, 1.5, RLlib no longer throws the error about the mismatch between obs and reward, and I think it works well now. Cheers!

By the way, I don't fully understand the second point you mentioned, @sven1977: "RLlib will sum up rewards for an agent if the agent does not have an observation accompanying the reward."

For example, in my env I have two agents, and I designed the env so that the agent that acts, the agent that receives the next obs, and the agent that receives the reward alternate like this (a minimal code sketch follows the list):
step 0: act: agent1 → obs: agent2, reward: agent1
step 1: act: agent2 → obs: agent1, reward: agent2
step 2: act: agent1 → obs: agent2, reward: agent1
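
To make the pattern concrete, here is a minimal sketch of what I mean, using RLlib's `MultiAgentEnv` API from Ray 1.5 (this is not my actual env; the class name `TwoAgentTurnEnv`, the spaces, and the 10-step horizon are made up for illustration):

```python
import gym
from ray.rllib.env.multi_agent_env import MultiAgentEnv


class TwoAgentTurnEnv(MultiAgentEnv):
    """Turn-based two-agent env: the obs goes to the agent that acts next,
    while the reward goes to the agent that just acted."""

    def __init__(self, config=None):
        self.observation_space = gym.spaces.Discrete(5)
        self.action_space = gym.spaces.Discrete(2)
        self.turn = None  # which agent acts next
        self.t = 0

    def reset(self):
        self.t = 0
        self.turn = "agent1"
        # Only the agent that acts first receives an initial observation.
        return {"agent1": 0}

    def step(self, action_dict):
        # action_dict contains only the action of the agent whose turn it was.
        acting = self.turn
        other = "agent2" if acting == "agent1" else "agent1"
        self.t += 1
        done = self.t >= 10  # arbitrary episode length for illustration

        # Observation is returned for the *other* agent (it acts next),
        # but the reward is for the agent that just acted.
        obs = {other: self.t % 5}
        rewards = {acting: 1.0}
        dones = {"__all__": done}
        self.turn = other
        return obs, rewards, dones, {}
```
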

So in any single step, the obs and the reward never belong to the same agent. Does this mean the policy will sum up the rewards for each agent and wait until the end of the episode to learn? Thanks in advance for your reply!