Accessing other agents' rewards and actions in the PPO loss for a multi-agent environment

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi all! I have a multi-agent environment in which each agent is currently trained with PPO. In the loss function, each agent only has access to its own reward, observation, and the action it has taken. I am trying to replace PPO with another algorithm I am researching, whose loss also takes the rewards and actions of the other agents into account. However, I don’t know how to access that data in the loss, or even whether this is possible in RLlib.
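
To make the question concrete, this is roughly the data I would like to have available inside the loss. It is only a plain-Python sketch of the idea, not RLlib code: the function `add_other_agents_data`, the keys `other_actions` and `other_rewards`, and the dict-of-lists trajectory layout are all hypothetical names chosen for illustration, and it assumes every agent acts at every timestep so that the trajectories line up.

```python
from typing import Dict, List

# Hypothetical per-agent trajectory: plain lists, one entry per timestep.
Trajectory = Dict[str, List[float]]


def add_other_agents_data(
    agent_id: str, trajectories: Dict[str, Trajectory]
) -> Dict[str, list]:
    """Return a copy of one agent's trajectory, extended with the
    actions and rewards of every other agent at each timestep."""
    own = trajectories[agent_id]
    # Sort the other agent IDs so the column order is deterministic.
    others = sorted(a for a in trajectories if a != agent_id)
    n_steps = len(own["rewards"])

    # Assumption: all agents act at every step, so trajectories align.
    for other in others:
        if len(trajectories[other]["rewards"]) != n_steps:
            raise ValueError("trajectories must have equal length")

    batch = {key: list(values) for key, values in own.items()}
    # One row per timestep, one column per other agent.
    batch["other_actions"] = [
        [trajectories[o]["actions"][t] for o in others] for t in range(n_steps)
    ]
    batch["other_rewards"] = [
        [trajectories[o]["rewards"][t] for o in others] for t in range(n_steps)
    ]
    return batch


trajectories = {
    "agent_0": {"actions": [0, 1], "rewards": [1.0, 0.5]},
    "agent_1": {"actions": [1, 1], "rewards": [0.0, 2.0]},
}
batch = add_other_agents_data("agent_0", trajectories)
print(batch["other_actions"])
print(batch["other_rewards"])
```

In other words, for `agent_0` the loss would see its own `actions` and `rewards` plus the per-timestep actions and rewards of `agent_1`. What I am missing is where in RLlib such a merge could happen so that the extra columns reach the loss function.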