Incompatibility of Differentiable Communication in RLlib

Hi, I came across this statement by Eric Liang: “it is occasionally useful to allow for differentiable communication between agents. This can allow for efficient modeling of shared computations or communication channels between agents in the environment. Supporting this feature conflicts with existing RLlib abstractions for defining policies.”
Could you explain why supporting this feature conflicts with the existing RLlib abstractions for defining policies? Thanks!

Hey @kia , thanks for the question. We have been thinking about this problem, sharing models between policies in multi-agent setups, for some time now. The key issue is that, even though we can access the other agents’ batches (environment rollouts, including observations/actions/rewards) inside any agent’s postprocessing/loss function, these data are always static. See for example our centralized_critic.py and centralized_critic_2.py example scripts.
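To make the “static data” point concrete, here is a minimal sketch (not RLlib code; the “policy” is just a scalar weight, a hypothetical simplification): once the other agent’s outputs are stored in a sample batch as plain numpy arrays, the link back to that agent’s parameters is gone, so no gradient can flow through them.

```python
import numpy as np

# Hypothetical illustration (not RLlib code): the other agent's "policy" is
# just a scalar weight, and a rollout records only its numeric outputs.
w_other = 2.0  # the other agent's policy parameter

def other_policy_act(obs):
    return w_other * obs  # the computation we would *like* to differentiate

# What lands in a sample batch: plain numpy arrays, no computation graph.
batch = {
    "obs": np.array([1.0, 2.0]),
    "actions": np.array([other_policy_act(1.0), other_policy_act(2.0)]),
}

# Inside our own loss we can *read* these numbers ...
loss = float(np.sum(batch["actions"] ** 2))

# ... but there is no way to recover d(loss)/d(w_other) from the batch alone:
# the connection between `actions` and `w_other` was severed when the values
# were stored as static arrays.
```

This is exactly why accessing other agents’ batches in postprocessing/loss functions does not give you differentiable communication: the data are numbers, not nodes in a differentiable graph.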

ray/rllib/examples/centralized_critic.py: In this example, the value function network is NOT shared between different policies (“pol1” and “pol2”), but rather each policy uses its own
value network. The “central” aspect here comes from the fact that these
two value networks (from “pol1” and “pol2”) both see all agents’ observations.
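The idea in that example can be sketched as follows (a hypothetical, numpy-only simplification with linear value functions, not the actual RLlib models): each policy keeps its own value network, but both networks are fed the concatenation of all agents’ observations.

```python
import numpy as np

# Hypothetical sketch of the centralized_critic.py idea: each policy keeps
# its OWN value network, but that network sees ALL agents' observations.
rng = np.random.default_rng(0)

obs_pol1 = rng.standard_normal(4)  # pol1's own observation
obs_pol2 = rng.standard_normal(4)  # pol2's own observation
central_input = np.concatenate([obs_pol1, obs_pol2])  # both see everything

# Two independent (linear, for illustration) value functions:
W_vf1 = rng.standard_normal(8)  # updated only by pol1's loss
W_vf2 = rng.standard_normal(8)  # updated only by pol2's loss

v1 = W_vf1 @ central_input  # pol1's value estimate
v2 = W_vf2 @ central_input  # pol2's value estimate

# The "central" part is the shared INPUT, not shared WEIGHTS:
assert not np.array_equal(W_vf1, W_vf2)
```

Note that gradients from pol1’s loss only ever reach `W_vf1`; nothing here differentiates through the other policy’s model.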

ray/rllib/examples/centralized_critic_2.py: Very similar to the above setup: no actually shared model between policies (both policies use their own separate value functions and train them independently).

What you want is one policy having access to another policy’s (agent’s) model in order to compute gradients through that other policy’s model. RLlib cannot currently do this.
