Distributed APPO With Flexible Number of Workers and Custom Environment

Denys_Ashikhin · December 17, 2020, 6:34pm

Hi,

A friend and I are working on creating an RL algorithm (currently looking at LSTM/PPO) to play a specific game; let’s call it a card game. Since the game is external, we have created a way to interact with it using simulated mouse movements/clicks and some image recognition to identify key values of the current game state; the rest is calculated and stored internally.

These games can take a while, so we were hoping to spin up VMs on our computers and, depending on the time of day (say 2pm vs 3am), have a varying number of VMs actively playing the game, with training happening on my computer, which will be running 24/7. As a side note, because we have to control mouse movements, we can only have one player/environment per VM.

There is a lot to unpack, so I would like to get some guided opinions on the best way to approach this. From what I’ve seen, Ray has everything necessary for this to work. What we will need to do:

1. Wrap our current game interactivity in a custom environment, and implement the methods as in this example: https://github.com/ray-project/ray/blob/master/rllib/examples/env/parametric_actions_cartpole.py
2. Create a custom evaluation function for our env as per: https://github.com/ray-project/ray/blob/82f9c7014e2d0acd3e3869066f5dc3142ec9e7a7/rllib/agents/trainer.py#L730
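
To make the environment wrapper more concrete, here is a rough sketch of the shape I have in mind. It is only an illustration under assumptions: `FakeGameClient` is a hypothetical stand-in for our mouse/image-recognition layer, the `"action_mask"` and `"state"` keys and the sizes are made up, and the class only follows the Gym-style `reset()`/`step()` convention. For actual use with RLlib it would presumably need to subclass `gym.Env` and declare `observation_space` and `action_space`, as the linked example does.

```python
import random
from typing import Dict, List, Tuple

NUM_ACTIONS = 5  # hypothetical: size of the full (fixed) action set
OBS_SIZE = 4     # hypothetical: length of the extracted state vector


class FakeGameClient:
    """Hypothetical stand-in for the mouse/image-recognition layer."""

    def __init__(self, seed: int = 0, max_turns: int = 10):
        self._rng = random.Random(seed)
        self._max_turns = max_turns
        self._turn = 0

    def start_new_game(self) -> None:
        self._turn = 0

    def read_state(self) -> List[float]:
        # The real version would come from image recognition.
        return [self._rng.random() for _ in range(OBS_SIZE)]

    def legal_actions(self) -> List[int]:
        # Always returns at least one legal action.
        k = self._rng.randint(1, NUM_ACTIONS)
        return sorted(self._rng.sample(range(NUM_ACTIONS), k))

    def perform(self, action: int) -> float:
        # The real version would click in the game; here the reward is a toy value.
        self._turn += 1
        return 1.0 if action % 2 == 0 else 0.0

    def game_over(self) -> bool:
        return self._turn >= self._max_turns


class CardGameEnv:
    """Gym-style env: reset() -> obs, step(action) -> (obs, reward, done, info)."""

    def __init__(self, client: FakeGameClient):
        self.client = client
        self._legal: List[int] = []

    def _observe(self) -> Dict[str, List[float]]:
        # Expose which actions are currently legal alongside the state,
        # similar in spirit to the parametric-actions example.
        self._legal = self.client.legal_actions()
        mask = [1.0 if a in self._legal else 0.0 for a in range(NUM_ACTIONS)]
        return {"action_mask": mask, "state": self.client.read_state()}

    def reset(self) -> Dict[str, List[float]]:
        self.client.start_new_game()
        return self._observe()

    def step(self, action: int) -> Tuple[Dict[str, List[float]], float, bool, dict]:
        if action not in self._legal:
            raise ValueError(f"illegal action {action}; legal: {self._legal}")
        reward = self.client.perform(action)
        done = self.client.game_over()
        return self._observe(), reward, done, {}
```

A rollout would then call `env.reset()` once and loop on `env.step(action)` until `done` is true, picking only actions whose mask entry is 1.
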

After that is done, I’d need to set up some distributed logic (initial sights set on APPO). However, I’m not sure if this is the best approach, or which examples or documentation pages would be most applicable to my use case.
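
One option that looks like it might fit the "VMs come and go" part is RLlib's external-environment client/server setup, where each VM runs a `PolicyClient` (from `ray.rllib.env.policy_client`) that talks to a trainer using `PolicyServerInput`. I have not verified this for our case, and the API may differ between Ray versions, so the sketch below only shows the per-VM episode loop. `StubPolicyClient` and `CountdownEnv` are hypothetical stand-ins so the loop runs without Ray; the stub mimics only the four `PolicyClient` methods the loop calls.

```python
import random


class StubPolicyClient:
    """Hypothetical stand-in exposing the PolicyClient methods used below."""

    def __init__(self, num_actions, seed=0):
        self._rng = random.Random(seed)
        self._num_actions = num_actions
        self.logged = []  # rewards reported via log_returns
        self.ended = []   # ids of finished episodes

    def start_episode(self, training_enabled=True):
        return "episode-%d" % len(self.ended)

    def get_action(self, episode_id, observation):
        # The real client would query the policy; here we act randomly.
        return self._rng.randrange(self._num_actions)

    def log_returns(self, episode_id, reward):
        self.logged.append(reward)

    def end_episode(self, episode_id, observation):
        self.ended.append(episode_id)


class CountdownEnv:
    """Toy Gym-style env that ends after a fixed number of steps."""

    def __init__(self, length=5):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        return [float(self.t)], 1.0, self.t >= self.length, {}


def play_one_episode(client, env, max_steps=1000):
    """Run one episode on this VM, reporting every step to the client."""
    episode_id = client.start_episode(training_enabled=True)
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = client.get_action(episode_id, obs)
        obs, reward, done, _info = env.step(action)
        client.log_returns(episode_id, reward)
        total_reward += reward
        if done:
            break
    # Always close the episode, even if max_steps cut it short.
    client.end_episode(episode_id, obs)
    return total_reward


# With Ray installed, the stub would be replaced by something like
# (address and port are placeholders):
#   from ray.rllib.env.policy_client import PolicyClient
#   client = PolicyClient("http://<trainer-host>:9900", inference_mode="remote")
```

Since each VM would just loop over `play_one_episode` while it is up, the number of active VMs could in principle change over the day without restarting the trainer, but that is exactly the part I would like confirmed.
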

So if anyone could confirm whether #1 and #2 are enough to port our current code to work with RLlib, and how to pair that with the distributed part, we would be very thankful!
