A single centralized critic for multiple actor agents

I wonder if this is possible using RLlib because, as I understand it, in the centralized critic example (ray/centralized_critic_2.py at master · ray-project/ray · GitHub) each agent has its own centralized critic.

Can I implement a custom algorithm with a single centralized critic that outputs a value for all agents, given the current state and all agents’ actions? What options do I have if I want to implement this?
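For concreteness, here is a minimal framework-free sketch of the interface I have in mind (class and method names are hypothetical, not RLlib API): one critic object that consumes the global state plus the joint action of all agents and returns a value for each agent.

```python
class CentralizedCritic:
    """Single critic mapping (global state, all agents' actions) -> per-agent values."""

    def __init__(self, agent_ids):
        self.agent_ids = list(agent_ids)
        # Stand-in for learned parameters: a fixed per-agent weight.
        self.weights = {aid: 1.0 for aid in self.agent_ids}

    def value(self, global_state, joint_actions):
        # A real critic would be a neural network over the concatenated
        # state and joint-action inputs; a toy linear score keeps this runnable.
        base = sum(global_state) + sum(joint_actions.values())
        return {aid: self.weights[aid] * base for aid in self.agent_ids}


critic = CentralizedCritic(["agent_0", "agent_1"])
values = critic.value(global_state=[0.5, -0.25],
                      joint_actions={"agent_0": 1.0, "agent_1": 0.0})
```

The key point is that there is exactly one `critic` object, queried with the joint information of all agents, rather than one critic per agent.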


Have you had a look at variable sharing between policies?

With ModelV2, you can put layers in global variables and straightforwardly share those layer objects between models instead of using variable scopes.
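A simplified sketch of that sharing pattern, assuming illustrative class names (this is not the actual ModelV2 API): the critic layers live in a module-level variable, and every per-agent model holds a reference to that same object, so all agents effectively train one shared value function.

```python
class SharedCritic:
    """Stand-in for a critic network's layers."""

    def __init__(self):
        self.w = 1.0  # single shared parameter

    def value(self, x):
        return self.w * x


# "Global variable" holding the shared layer object, created exactly once.
SHARED_CRITIC = SharedCritic()


class AgentModel:
    def __init__(self):
        # Do not build a new critic here; point at the shared instance.
        self.critic = SHARED_CRITIC


model_a = AgentModel()
model_b = AgentModel()

# Because both models reference the same object, an update made through
# one model is immediately visible through the other.
model_a.critic.w = 2.0
```

With real ModelV2 models the same idea applies: build the critic layers once at module scope and reuse those layer objects inside each model's constructor, instead of relying on variable scopes for sharing.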