I want to use an ES-like method, where the algorithm estimates the gradient by sampling rollouts with local Gaussian noise added to the NN parameters, in a multi-agent environment alongside other algorithms. For example, I would like to train PPO agents concurrently with ES agents.
This does not seem possible at the moment. It seems that ES and ARS each have their own Worker class, which doesn't support multi-agent environments. Is there a plan to unify the ES worker with the usual RolloutWorker? If not, is there a workaround that would let me do this kind of training?
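For concreteness, here is a rough sketch of the kind of ES-style gradient estimate I have in mind. It is plain NumPy and not RLlib code. `es_gradient` and `fitness_fn` are hypothetical names, with `fitness_fn` standing in for "run one episode with these parameters and return the episode return". I'm also assuming mirrored (antithetic) perturbations and raw returns here, which may differ from what the built-in ES and ARS implementations do.

```python
import numpy as np


def es_gradient(theta, fitness_fn, sigma=0.1, num_samples=64, rng=None):
    """Estimate the gradient of E[fitness] w.r.t. theta from noisy evaluations.

    theta:       flat parameter vector (1-D array).
    fitness_fn:  hypothetical callable mapping a parameter vector to a scalar
                 score, e.g. the return of one rollout.
    sigma:       standard deviation of the Gaussian parameter noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One Gaussian perturbation per sample, same shape as theta.
    eps = rng.standard_normal((num_samples, theta.size))
    # Mirrored sampling: evaluate theta + sigma*eps and theta - sigma*eps.
    f_pos = np.array([fitness_fn(theta + sigma * e) for e in eps])
    f_neg = np.array([fitness_fn(theta - sigma * e) for e in eps])
    # Weight each noise direction by the score difference it produced.
    return (f_pos - f_neg) @ eps / (2.0 * sigma * num_samples)


# Toy usage: maximize -||theta - target||^2 with plain gradient ascent.
target = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
rng = np.random.default_rng(0)
for _ in range(200):
    grad = es_gradient(theta, lambda p: -np.sum((p - target) ** 2), rng=rng)
    theta = theta + 0.05 * grad
```

In the multi-agent case, the perturbed parameters would belong to only the ES-controlled policies, while the PPO policies in the same environment keep collecting ordinary sample batches. That is the part I don't see how to express with the current ES worker.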
@rliaw I'd be happy to make a pull request. However, I'm not sure I understand all the components required to make the RolloutWorker work for ES-like algorithms. It would be nice to have a faster communication channel, as I suspect I will have a lot of questions along the way. What do you think?