I’m trying to configure the QMix algorithm in RLlib 2.3.0.
Previously I configured a DQN algorithm and it worked well, but now I need my agents to cooperate with each other. I read that QMix is a good option for centralized training with a decentralized
get_action(). Can you confirm that? Is there a better alternative?
My main question is the following: how can I configure agent grouping in a Server/Client (or ExternalEnv) setup?
Another post said that you need to define your environment, but in the configuration I need there is no environment, as is the case in this post, where it was suggested that the grouping must be done on the PolicyClient side. I don’t really understand that final suggestion. Is there an example of the QMix algorithm with the environment set to None?