Is there a way to use derivative-free methods in a multi-agent setting?

I want to work with an ES-like method, where the algorithm estimates the gradient by sampling data with local Gaussian noise added to the NN parameters, in a multi-agent environment alongside other algorithms. For example, training PPO agents concurrently with ES agents.
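For context, here is a minimal sketch of the kind of ES gradient estimate I mean (OpenAI-style Evolution Strategies with antithetic sampling). The function name `evaluate_return` is a hypothetical placeholder for "run a rollout with these parameters and return the episode return"; this is not RLlib's actual ES implementation:

```python
import numpy as np

def es_gradient_estimate(theta, evaluate_return, sigma=0.1, n_samples=500, rng=None):
    """Estimate the gradient of evaluate_return at theta by Gaussian smoothing.

    Uses antithetic (mirrored) perturbations theta +/- sigma*eps to reduce
    variance, as in OpenAI-style Evolution Strategies.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.standard_normal(theta.shape)
        r_plus = evaluate_return(theta + sigma * eps)   # rollout with perturbed params
        r_minus = evaluate_return(theta - sigma * eps)  # mirrored perturbation
        grad += (r_plus - r_minus) * eps
    return grad / (2 * sigma * n_samples)

# Toy check on a known objective: for f(theta) = -||theta||^2 the true
# gradient is -2 * theta, so the estimate should point toward the origin.
theta = np.array([1.0, -2.0])
g = es_gradient_estimate(
    theta,
    lambda t: -np.sum(t ** 2),
    sigma=0.05,
    rng=np.random.default_rng(0),
)
```

In a multi-agent setup, `evaluate_return` would have to run the perturbed ES policy inside the shared environment together with the concurrently training PPO agents, which is exactly what the current ES `Worker` class does not support.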

This does not seem possible at the moment. ES and ARS each have their own Worker class that doesn't support multi-agent environments. Is there a plan to unify the ES worker with the usual RolloutWorker? If not, is there a workaround that would allow me to achieve this kind of training?


Currently I think this is not a prioritized issue.

Unfortunately RLlib is a bit limited in development bandwidth right now. Would you be interested in making a pull-request?

This should actually be a well-scoped task.


@rliaw I'd be happy to make a pull request. However, I'm not sure if I understand all the components required to make the RolloutWorker applicable to ES-like algorithms. It would be nice to have a faster communication channel, as I suspect I'll have a lot of questions along the way. What do you think?

Sure! Join the Ray Slack and ping #rllib :slight_smile:

@rliaw Cool. Thanks!