Trying to set up external RL environment and having trouble

I’m working on a project that is somewhat similar to this, although I didn’t end up using the ExternalEnv API, so I’m not an expert. The way to implement this with an ExternalEnv is to use a PolicyClient and a PolicyServerInput. The docs (RLlib Environments — Ray v1.6.0) point to a simple example that uses them.

(The server script: ray/cartpole_server.py at master · ray-project/ray · GitHub and the client script: ray/cartpole_client.py at master · ray-project/ray · GitHub)

The basic idea here is that you run the two scripts at the same time, which in your case translates to one script that trains your RL agent (the server) and one that runs the Tune experiments whose hyperparams you want to optimize (the client). The server script from the example can mostly be left as-is; the client script is where you need to make most of the changes.
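For reference, here is a trimmed sketch of the server side, roughly following the linked cartpole_server.py. The observation/action spaces and the PPO choice are placeholders (your hyperparameter "environment" will have its own spaces), and the address/port values are just examples:

```python
import gym
import ray
from ray.rllib.agents.ppo import PPOTrainer
from ray.rllib.env.policy_server_input import PolicyServerInput

SERVER_ADDRESS = "localhost"
SERVER_PORT = 9900

if __name__ == "__main__":
    ray.init()

    config = {
        # No simulator on the server side; experiences arrive from the client.
        "env": None,
        # Placeholder spaces -- replace with the obs/action spaces of your
        # hyperparameter-tuning "environment".
        "observation_space": gym.spaces.Box(float("-inf"), float("inf"), (4,)),
        "action_space": gym.spaces.Discrete(2),
        # Read experiences via PolicyServerInput instead of sampling an env.
        "input": lambda ioctx: PolicyServerInput(ioctx, SERVER_ADDRESS, SERVER_PORT),
        "num_workers": 0,
        # Disable off-policy estimation, since rollouts come from the client.
        "input_evaluation": [],
    }

    trainer = PPOTrainer(config=config)

    # Keep training on whatever experiences the client sends over.
    while True:
        print(trainer.train())
```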

You want to make the client script run your hyperparameter-tuning code, and, as you said, you’ll want a custom scheduler. The PolicyClient should be wrapped by that scheduler, so you can report the tuning results back to the server script and also easily poll your agent for actions (i.e. new hyperparams to train on).
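On the client side, something along these lines is what I mean by "wrapping" the PolicyClient. This is only a rough sketch: the class and method names (HyperparamAgentClient, suggest_hyperparams, report_result) are hypothetical, and how you encode the tuning state as an observation and map the action back to a config dict is up to you. The PolicyClient calls themselves (start_episode, get_action, log_returns, end_episode) are the standard RLlib client API:

```python
from ray.rllib.env.policy_client import PolicyClient


class HyperparamAgentClient:
    """Hypothetical helper that a custom Tune scheduler could hold on to.

    It forwards tuning results to the policy server as rewards and asks the
    RL agent for actions, i.e. the next hyperparams to try.
    """

    def __init__(self, address="http://localhost:9900"):
        # "remote" inference mode: actions are computed on the server.
        self.client = PolicyClient(address, inference_mode="remote")
        self.episode_id = self.client.start_episode(training_enabled=True)

    def suggest_hyperparams(self, tuning_state_obs):
        # Poll the agent for an action given the current tuning state.
        return self.client.get_action(self.episode_id, tuning_state_obs)

    def report_result(self, reward, last_obs, done=False):
        # Report the trial's result back to the server as episode reward.
        self.client.log_returns(self.episode_id, reward)
        if done:
            self.client.end_episode(self.episode_id, last_obs)
```

Your custom scheduler would then call something like report_result from its on_trial_result / on_trial_complete hooks and use suggest_hyperparams to decide what configuration to launch next.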

There are quite a few moving parts in this setup, so I might have missed something, but I hope that helps. If not, feel free to post a follow-up :slight_smile:
