Setting global info state in Multi-Agent step function

Hi everyone!

Another question from my side. Consider the following situation:

You create a multi-agent environment with N agents that you want to train with the RLlib Trainer framework. The env changes over time, but this change is global, i.e. it is the same for every agent (e.g. a changing wind direction in a drone scenario). If you want to retrieve this global env state from a saved episode, you need to include it in the outputs of the environment's step function. This makes total sense, and in the single-agent case I have previously put such global states in the info dict. In the multi-agent case, however, we have to decide where the global state goes:

  • Append the global state to all agents? Sub-optimal and problematic for large global info (e.g. images)
  • Append it to only one agent? This feels hacky, since the state is global. If that specific agent were ever excluded from the env, the global state would be lost from the logs.
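To make the two options concrete, here is a minimal sketch in plain Python. It only mimics the shape of a multi-agent step() return value and does not import RLlib; `WindDroneEnv`, the `mode` argument, and the `wind_direction` field are hypothetical names made up for illustration.

```python
class WindDroneEnv:
    """Toy stand-in for a multi-agent env (hypothetical, not an RLlib class)."""

    def __init__(self, num_agents=3):
        self.agent_ids = [f"drone_{i}" for i in range(num_agents)]
        self.wind_direction = 0.0  # global state shared by all agents

    def step(self, action_dict, mode="all"):
        # The global state changes once per step, independent of any agent.
        self.wind_direction = (self.wind_direction + 15.0) % 360.0
        global_info = {"wind_direction": self.wind_direction}

        obs = {aid: [0.0] for aid in action_dict}
        rewards = {aid: 0.0 for aid in action_dict}
        dones = {aid: False for aid in action_dict}
        dones["__all__"] = False

        if mode == "all":
            # Option 1: copy the global state into every agent's info.
            infos = {aid: dict(global_info) for aid in action_dict}
        else:
            # Option 2: attach it to a single "carrier" agent only.
            carrier = self.agent_ids[0]
            infos = {aid: {} for aid in action_dict}
            if carrier in infos:
                infos[carrier] = dict(global_info)
        return obs, rewards, dones, infos
```

With mode="all" the same payload is duplicated N times per step, which is what hurts for large global infos. With the single-carrier variant, a step in which the carrier agent does not act returns no global state at all.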

These are all the options I have found so far, since the info dict only accepts agent IDs as keys. If there is another way, please let me know.

In case there is no better way, I would suggest allowing a __global__ key (or __all__, to prevent confusion), similar to the done signals. Ideally, this key would be optional, and its infos would be logged just like an agent's, so that they are easy to retrieve from the output file.
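A rough sketch of what I have in mind, again in plain Python. None of this exists in RLlib as far as I know: the `__global__` key and the `split_infos` / `log_step` helpers are hypothetical and only illustrate the proposed behaviour.

```python
GLOBAL_KEY = "__global__"  # hypothetical reserved key, not part of RLlib


def split_infos(infos):
    """Separate per-agent infos from the optional global entry."""
    agent_infos = {k: v for k, v in infos.items() if k != GLOBAL_KEY}
    global_info = infos.get(GLOBAL_KEY, {})  # the key is optional
    return agent_infos, global_info


def log_step(log, step_index, infos):
    """Append one step to an in-memory log, storing the global info once."""
    agent_infos, global_info = split_infos(infos)
    log.append({"t": step_index, "agents": agent_infos, "global": global_info})


# Example: infos as an env's step() might return them under this proposal.
episode_log = []
log_step(episode_log, 0, {
    "drone_0": {"battery": 0.9},
    "drone_1": {"battery": 0.8},
    GLOBAL_KEY: {"wind_direction": 15.0},
})
log_step(episode_log, 1, {"drone_1": {"battery": 0.7}})  # no global entry
```

This way the global state is stored once per step, survives no matter which agents are active, and envs that do not set the key behave exactly as before.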

Thanks and let me know what you think!