Is there a way to use derivative-free methods with multi-agent environments?

51616 · April 28, 2021, 3:34pm

I want to work with an ES-like method, where the algorithm estimates the gradient from rollouts sampled with local Gaussian noise added to the NN parameters, in a multi-agent environment alongside other algorithms. For example, I would like to train PPO agents concurrently with ES agents.

This does not seem possible at the moment. It seems that ES and ARS each have their own Worker class that doesn't support multi-agent environments. Is there a plan to unify the ES worker with the usual RolloutWorker? If not, is there a workaround that would let me do this kind of training?
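
For reference, here is a minimal sketch of the kind of ES-style gradient estimate described above. It is plain NumPy and is not RLlib's ES implementation; `es_gradient`, `fitness_fn`, `sigma`, and `num_samples` are hypothetical names chosen for illustration, and standardizing the returns is an assumption on my part, not something RLlib is confirmed to do.

```python
import numpy as np


def es_gradient(theta, fitness_fn, sigma=0.1, num_samples=50, rng=None):
    """Estimate an ascent direction for fitness_fn at theta (illustrative only).

    theta      : 1-D array of flattened policy parameters
    fitness_fn : hypothetical callable mapping parameters to a scalar return
    sigma      : standard deviation of the Gaussian parameter noise
    """
    rng = np.random.default_rng() if rng is None else rng

    # One Gaussian perturbation per sample, shape (num_samples, num_params).
    eps = rng.standard_normal((num_samples, theta.size))

    # Evaluate each perturbed parameter vector (one rollout per sample).
    returns = np.array([fitness_fn(theta + sigma * e) for e in eps])

    # Standardize returns to reduce variance (an assumption of this sketch).
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)

    # Weight each noise vector by its standardized return and average.
    return eps.T @ adv / (num_samples * sigma)
```

In a multi-agent setup, `fitness_fn` would have to run an episode in which the ES-controlled agent uses the perturbed parameters while the other agents (e.g. PPO) act with their own policies, which is the part the current ES worker does not seem to support.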

rliaw · April 29, 2021, 3:06am

Currently I think this is not a prioritized issue.

Unfortunately RLlib is a bit limited in development bandwidth right now. Would you be interested in making a pull-request?

This should actually be a well-scoped task.

51616 · April 29, 2021, 4:39am

@rliaw I’ll be happy to make a pull request. However, I’m not sure I understand all the components required to make the RolloutWorker work with ES-like algorithms. It would be nice to have a faster communication channel, as I suspect I will have a lot of questions along the way. What do you think?

rliaw · April 29, 2021, 4:50am

Sure! Join the Ray Slack and ping #rllib?

51616 · April 29, 2021, 5:44am

@rliaw Cool. Thanks!
