Multi-Agent cyclic games with paused agents

mannyv · September 26, 2021, 8:55pm

At the end of the episode I think all you should have to do is return a dictionary with a reward for each agent.

If you look at the sampler code linked below, especially lines 774-779, you will see that when the env returns dones["__all__"] = True, the sampler creates a fake last observation for each agent that is not in the final observation.

Personally, I would do it myself in my env so that the semantics are really clear, but based on this code it should work fine either way.
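Here is a minimal sketch of what "doing it myself in my env" could look like. The class name TurnTakingEnv, the agent ids, and the reward values are all made up for illustration, and it deliberately does not subclass RLlib's MultiAgentEnv so it runs without Ray installed. It assumes the 4-tuple step() return (obs, rewards, dones, infos) used by RLlib at the time of this post.

```python
from typing import Dict, List, Tuple


class TurnTakingEnv:
    """Hypothetical two-agent cyclic game: only one agent acts per step."""

    def __init__(self, max_turns: int = 4):
        self.agents: List[str] = ["agent_0", "agent_1"]
        self.max_turns = max_turns
        self.turn = 0

    def _obs(self) -> List[float]:
        # Toy observation: just the current turn counter.
        return [float(self.turn)]

    def reset(self) -> Dict[str, List[float]]:
        self.turn = 0
        # Only the agent whose turn it is receives an observation.
        return {self.agents[0]: self._obs()}

    def step(self, action_dict: Dict[str, int]) -> Tuple[dict, dict, dict, dict]:
        self.turn += 1
        if self.turn >= self.max_turns:
            # Episode over: explicitly report to every agent, including
            # the paused ones, instead of relying on the sampler.
            obs = {a: self._obs() for a in self.agents}
            rewards = {a: 1.0 for a in self.agents}  # assumed shared reward
            dones = {a: True for a in self.agents}
            dones["__all__"] = True
            infos = {a: {} for a in self.agents}
            return obs, rewards, dones, infos
        # Otherwise hand the turn to the next agent in the cycle.
        nxt = self.agents[self.turn % len(self.agents)]
        return {nxt: self._obs()}, {nxt: 0.0}, {"__all__": False}, {nxt: {}}
```

With this approach the final step carries an observation, a reward, and a done flag for each agent, so nothing depends on the sampler filling in the missing observations.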

@sven1977 or @gjoliver can you confirm?

github.com

ray-project/ray/blob/90d2456ec70270a1f894ec3ef6f3004533859e03/rllib/evaluation/sampler.py#L752-L777

          if dones[env_id]["__all__"] or episode.length >= horizon:
              hit_horizon = (episode.length >= horizon
                             and not dones[env_id]["__all__"])
              all_agents_done = True
              atari_metrics: List[RolloutMetrics] = _fetch_atari_metrics(
                  base_env)
              if atari_metrics is not None:
                  for m in atari_metrics:
                      outputs.append(
                          m._replace(custom_metrics=episode.custom_metrics))
              else:
                  outputs.append(
                      RolloutMetrics(episode.length, episode.total_reward,
                                     dict(episode.agent_rewards),
                                     episode.custom_metrics, {},
                                     episode.hist_data, episode.media))
              # Check whether we have to create a fake-last observation
              # for some agents (the environment is not required to do so if
              # dones[__all__]=True).
              for ag_id in episode.get_agents():

This file has been truncated.
