I want to use an ES-like method, where the algorithm estimates the gradient by sampling data with local Gaussian noise added to the NN parameters, in a multi-agent environment alongside other algorithms. For example, I would like to train PPO agents concurrently with ES agents.
This does not seem possible at the moment. It seems that ES and ARS each have their own
Worker class, which doesn't support multi-agent environments. Is there a plan to unify the ES worker with the usual
RolloutWorker? If not, is there a workaround that would let me do this kind of training?
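
To make "ES-like" concrete, here is a minimal NumPy sketch of the kind of gradient estimate I mean. It is not RLlib code: the names `es_gradient` and `fitness_fn` are hypothetical, `fitness_fn` stands in for the episode return of a policy with the given flat parameters, and the mirrored (antithetic) sampling is just one common variant that I am assuming here.

```python
import numpy as np


def es_gradient(theta, fitness_fn, sigma=0.1, num_samples=64, rng=None):
    """Estimate the gradient of fitness_fn at theta from Gaussian perturbations.

    theta       -- flat parameter vector (e.g. flattened NN weights)
    fitness_fn  -- hypothetical callable: parameters -> scalar return
    sigma       -- standard deviation of the parameter noise
    num_samples -- number of noise vectors (each is evaluated twice, mirrored)
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)

    # One standard-normal noise vector per sample, same size as theta.
    noise = rng.standard_normal((num_samples, theta.size))

    # Evaluate the perturbed parameters in both directions (mirrored sampling).
    returns_pos = np.array([fitness_fn(theta + sigma * eps) for eps in noise])
    returns_neg = np.array([fitness_fn(theta - sigma * eps) for eps in noise])

    # Weight each noise direction by the difference in returns and average.
    weighted = (returns_pos - returns_neg)[:, None] * noise
    return weighted.sum(axis=0) / (2.0 * num_samples * sigma)


if __name__ == "__main__":
    # Toy stand-in for an environment: the return peaks at `target`.
    target = np.array([1.0, -2.0, 0.5])
    toy_return = lambda params: -np.sum((params - target) ** 2)

    grad = es_gradient(np.zeros(3), toy_return, num_samples=2000,
                       rng=np.random.default_rng(0))
    print("estimated gradient:", grad)
    print("analytic gradient: ", 2 * target)
```

In the setup I am asking about, `fitness_fn` would be a rollout of the ES-controlled agent in the same multi-agent environment where the PPO agents are collecting their own samples.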