Centralized critic, but decentralized evaluation

I have followed the `centralized_critic.py` example from the ray-project/ray repository on GitHub to get a centralized-critic implementation working for more than two agents.
During evaluation I restore the network and then call compute_single_action() on the trained policy.
Now that the setup is clear, here is my question: since compute_single_action() is independent of the critic network, is it possible that the centralized critic model is never invoked? How do we implement such behavior?
Motivating example for the question:
We train with 4 agents of the same type, but during evaluation we use 8 such agents; the 4 new agents reuse the policy IDs of the first 4. @sven1977 @ericl
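One way to reuse the trained policies for the extra agents is a policy mapping function that folds the new agent IDs back onto the existing policy IDs. Below is a minimal sketch; the agent ID scheme (`agent_0` … `agent_7`) and policy ID scheme (`policy_0` … `policy_3`) are assumptions for illustration, not taken from the actual setup:

```python
NUM_TRAINED_POLICIES = 4  # number of policies available from training

def policy_mapping_fn(agent_id, *args, **kwargs):
    """Map 8 evaluation agents onto 4 trained policy IDs.

    Hypothetical naming: agent IDs look like 'agent_5', so
    agent_4..agent_7 wrap around to policy_0..policy_3.
    """
    idx = int(agent_id.split("_")[-1]) % NUM_TRAINED_POLICIES
    return f"policy_{idx}"

# agent_5 -> policy_1, agent_3 -> policy_3
```

A function with this signature can be passed to RLlib's multi-agent config so that each new agent is served by one of the already-trained policies.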

Hi @kapilPython,

The actor of the TorchCentralizedCriticModel (self.model) is a decentralized model, and it is what compute_*action uses. You already have the behavior you seek.

It is the value function that is centralized, and it is only called during training: on the collected rollouts in postprocess_trajectory (at the end of an episode or rollout) and in the loss function.
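The split can be sketched structurally. This is plain Python, not the real RLlib model class; the method names echo the example, but everything here is a simplified stand-in:

```python
class CentralizedCriticSketch:
    """Structural sketch of the actor/critic split (not the real RLlib model)."""

    def forward(self, own_obs):
        # Decentralized actor: needs only this agent's own observation.
        # Placeholder "logits": any function of own_obs alone.
        return [x * 0.5 for x in own_obs]

    def central_value_function(self, own_obs, opponent_obs, opponent_actions):
        # Centralized critic: needs everyone's observations and actions,
        # so it can only be evaluated during training, never at deployment.
        joint_input = own_obs + opponent_obs + opponent_actions
        return sum(joint_input)
```

At deployment only `forward` is exercised, which is why the trained policy can act without the other agents' observations.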

Hi @mannyv, yes, you are right. I ran some experiments to check the validity of what I wanted to do, and I can confirm that the critic and actor are already bifurcated; once deployed, the policy will not need the other agents' observations.


If you were to use the critic to get the estimated return, it would need the other agents' observations, but using only the actor policy to get actions requires just a single agent's observations.

For example, you could not use that model for decentralized training, because you would not have the other agents' observations and actions, which are required for the value function.
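To make the dependence concrete, here is a back-of-the-envelope comparison of the two networks' input sizes; the dimensions are made up for illustration:

```python
def actor_input_dim(obs_dim):
    # Decentralized actor: only this agent's observation.
    return obs_dim

def critic_input_dim(obs_dim, act_dim, n_agents):
    # Centralized critic: own obs plus every other agent's obs and action.
    return obs_dim + (n_agents - 1) * (obs_dim + act_dim)

# Hypothetical numbers: 4 agents, obs_dim=10, act_dim=2.
# The actor sees 10 features; the critic sees 10 + 3 * (10 + 2) = 46.
```

This is why the actor transfers cleanly to deployment while the critic cannot be evaluated without the other agents' data.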