Hi all,
This may be a very simple question, but I am having a hard time understanding it. I am following along with the Two-Step Game environment for multi-agent reinforcement learning, and I am training PPO with a centralized critic (see: https://github.com/ray-project/ray/blob/master/rllib/examples/centralized_critic.py).
After I train the model, how do I deploy it for execution only? In the same vein, I have also looked at using QMIX (see: https://github.com/ray-project/ray/blob/master/rllib/examples/two_step_game.py), where the agents are grouped during training. Should I ungroup them during execution?
Thanks for the help!