MASAC in RLlib does not natively implement a single centralized critic (plus its twin) shared across all agents for centralized training with decentralized execution; instead, each policy maintains its own critic pair. To get one critic truly shared by all agents, you would need to customize the model and training logic yourself, as shown in RLlib's centralized-critic examples and documentation. Those examples provide the general pattern, but MASAC does not offer a single shared critic network out of the box.
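As a rough illustration of the kind of model such a customization would involve, here is a minimal standalone PyTorch sketch of a twin Q-network that conditions on the joint observations and actions of all agents, so a single instance can be shared across every agent's policy. All class and parameter names here are illustrative assumptions, not RLlib API; wiring this into RLlib still requires a custom model and loss as in the linked examples.

```python
import torch
import torch.nn as nn

class SharedTwinCritic(nn.Module):
    """Illustrative centralized twin Q-network for MASAC-style training.

    Both Q-heads see the joint observation and joint action of ALL agents,
    so one instance can be shared across agents. This is a sketch only,
    not RLlib's API: integrating it requires a custom model and training
    logic, per RLlib's centralized-critic examples.
    """

    def __init__(self, n_agents: int, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)

        def q_net() -> nn.Sequential:
            return nn.Sequential(
                nn.Linear(joint_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        # Twin critics, as in standard SAC (min over the two reduces bias).
        self.q1, self.q2 = q_net(), q_net()

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor):
        x = torch.cat([joint_obs, joint_act], dim=-1)
        return self.q1(x), self.q2(x)

# Usage: 3 agents, each with an 8-dim observation and 2-dim continuous action;
# batch of 4 joint transitions.
critic = SharedTwinCritic(n_agents=3, obs_dim=8, act_dim=2)
q1, q2 = critic(torch.randn(4, 3 * 8), torch.randn(4, 3 * 2))
```

During training, every agent's actor update would query this one shared critic, while each agent still acts from its own local observation at execution time.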
Would you like more detail on how to implement a single centralized critic in RLlib?
Sources:
- https://discuss.ray.io/t/centralized-critic-but-decentralized-evaluation/4100
- https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/models/centralized_critic_models.py
Hint: Mention @RunLLM in the post for follow-ups.