I’m trying to use Ray’s distributed capabilities with RLlib. I’m using Kubernetes for orchestration, and from what I’ve read, Ray creates its own mini cluster on top of a Kubernetes cluster within a single namespace.
My sim is actually external to my Gym env, and I would like to keep it that way. In my step function, I reach out to the sim via RPC calls to acquire the new state. Ideally I’d have three services running: the Gym env, the sim, and an adapter between the two. I’m well acquainted with Kubernetes, so my instinct is to write a Kubernetes YAML to orchestrate all of them, but with ray.init() and ray.tune() I’m not sure how to spin up multiple services when my environment comes up.
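Schematically, my env looks something like this (the client class is just a stand-in for my real RPC adapter; all names and addresses here are illustrative):

```python
import gym
import numpy as np


class SimAdapterClient:
    """Stand-in for my real RPC client to the adapter service."""

    def __init__(self, address):
        self.address = address  # e.g. the k8s service DNS name of the adapter

    def reset(self):
        ...  # RPC call that resets the external sim and returns the initial state

    def step(self, action):
        ...  # RPC call that advances the sim one tick; returns (obs, reward, done)


class SimBackedEnv(gym.Env):
    """Gym env whose dynamics live in an external simulator reached over RPC."""

    def __init__(self, config=None):
        config = config or {}
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(4,))
        self.action_space = gym.spaces.Discrete(2)
        self.sim = SimAdapterClient(config.get("sim_address", "sim-adapter:50051"))

    def reset(self):
        return np.asarray(self.sim.reset(), dtype=np.float32)

    def step(self, action):
        # One RPC round trip per step to fetch the new state from the sim.
        obs, reward, done = self.sim.step(int(action))
        return np.asarray(obs, dtype=np.float32), reward, done, {}
```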
I feel like Ray should be able to handle this. Do I need to create a threaded Actor? I’ve found good information on RLlib and on Ray Core’s remote processing separately, but combining the two is giving me some trouble. Any advice will help!
The most favourable solution is one where you get external RPCs out of the picture entirely. If that is not possible and your simulator is a clunky piece of software that you have to start by hand and that then waits for incoming commands (such that it cannot properly be managed as a service by your cluster), you can instead introduce a custom resource with Ray. The custom resource acts as a symbolic token for a 1-1 mapping from rollout workers to your remote environments: create your own gym-style env that requires this custom resource, and have each rollout worker talk to its own instance of your simulation.
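Roughly, the setup could look like the sketch below. The resource name `sim_slot`, the module path, and all addresses are made up; `custom_resources_per_worker` is the RLlib config key I have in mind for this, but double-check it against your Ray version:

```python
import ray
from ray import tune

from my_envs import SimBackedEnv  # the RPC-backed env sketched above (illustrative path)

# On each node that has a simulator instance attached, start Ray with a
# symbolic custom resource, e.g.:
#   ray start --address=<head-address> --resources='{"sim_slot": 1}'
# (For a local test you can instead pass resources={"sim_slot": 4} to ray.init().)

ray.init(address="auto")

tune.run(
    "PPO",
    config={
        "env": SimBackedEnv,
        "num_workers": 4,
        # Each rollout worker reserves one sim_slot, which gives you the
        # 1-1 mapping from workers to simulator instances.
        "custom_resources_per_worker": {"sim_slot": 1},
        "env_config": {
            # RLlib passes this dict to the env as an EnvContext; each env
            # instance can use its worker_index to pick "its" simulator's
            # address out of a lookup you provide here.
        },
    },
)
```

The custom resource does no scheduling magic by itself; it just stops Ray from placing more rollout workers on a node than there are simulators attached to it, so each worker can safely claim a dedicated sim.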
Does this help? I am sorry there is no out-of-the-box, super easy solution to this. It might require a fair amount of legwork.