Custom VectorEnv for efficient parallel environment evaluation

The RLlib documentation on vectorized envs states that

RLLib will auto-vectorize Gym envs for batch evaluation […] you can define a custom environment class that subclasses VectorEnv to implement vector_step() and vector_reset()

I would like to do the latter, i.e. implement a Gym-style environment that performs the parallel processing of multiple environment instances at a lower level (e.g. using numpy, or even torch on a GPU) instead of running multiple parallel environment instances. I think this has the potential to significantly increase throughput for my application. However, I am failing to do as instructed. I inherit from VectorEnv, call the super-constructor with my action and observation space for a single environment as well as the number of envs I am implementing, and then implement vector_step and vector_reset. In contrast to what is stated in the docs, things start falling apart due to get_unwrapped not being implemented. Implementing this function is not possible for my use case, though, as it would defeat the purpose of implementing vectorization at a low level:

(pid=121)   File "/usr/local/lib/python3.8/dist-packages/ray/rllib/evaluation/rollout_worker.py", line 1133, in stop
(pid=121)     self.async_env.stop()
(pid=121)   File "/usr/local/lib/python3.8/dist-packages/ray/rllib/env/base_env.py", line 221, in stop
(pid=121)     for env in self.get_unwrapped():
(pid=121)   File "/usr/local/lib/python3.8/dist-packages/ray/rllib/env/base_env.py", line 372, in get_unwrapped
(pid=121)     return self.vector_env.get_unwrapped()
(pid=121)   File "/usr/local/lib/python3.8/dist-packages/ray/rllib/env/vector_env.py", line 95, in get_unwrapped
(pid=121)     raise NotImplementedError

What am I missing? Or is this, indeed, not possible?
Thanks!
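For reference, this is roughly the kind of batched environment I have in mind (a minimal, self-contained numpy sketch; the class and its dynamics are invented for illustration, and in practice it would subclass ray.rllib.env.VectorEnv and pass the single-env observation/action spaces and num_envs to the super-constructor):

```python
import numpy as np

# Hypothetical batched environment implementing the VectorEnv-style
# interface (vector_reset / reset_at / vector_step) with single numpy
# state arrays instead of N separate env objects.
class BatchedPointEnv:
    def __init__(self, num_envs=8, horizon=100):
        self.num_envs = num_envs
        self.horizon = horizon
        self.pos = np.zeros(num_envs, dtype=np.float32)
        self.t = np.zeros(num_envs, dtype=np.int64)

    def vector_reset(self):
        # Reset all sub-environments at once; returns a list of obs.
        self.pos[:] = 0.0
        self.t[:] = 0
        return list(self.pos.copy())

    def reset_at(self, index):
        # Reset a single sub-environment (RLlib calls this when one
        # sub-env finishes its episode).
        self.pos[index] = 0.0
        self.t[index] = 0
        return self.pos[index]

    def vector_step(self, actions):
        # One batched step for all sub-envs; `actions` has num_envs entries.
        a = np.asarray(actions, dtype=np.float32)
        self.pos += a
        self.t += 1
        rewards = -np.abs(self.pos)  # toy reward: stay near the origin
        dones = self.t >= self.horizon
        infos = [{} for _ in range(self.num_envs)]
        return list(self.pos.copy()), list(rewards), list(dones), infos
```

The point is that a single numpy (or torch) operation advances all sub-environments at once, so there are no per-sub-env Python objects that get_unwrapped could meaningfully return.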

Hey @janblumenkamp , hmm, yeah, we shouldn’t require get_unwrapped for custom vector envs. I don’t think we have an example for this customization. Let me create one. …

Thanks for the quick response, Sven! I looked a bit into the rollout_worker, and get_unwrapped only seems to be required for foreach_env, so a quick hack would be to implement it as

    def get_unwrapped(self) -> List[EnvType]:
        return [self]

I will play around a bit more with it, so no urgency.

Yes, I think this is a bug and we should not require implementing it (you are right, it breaks the logic of a truly custom VectorEnv, which may not even have underlying base envs). I’ll fix it. We could add a try block for the case where the user calls foreach_env and it’s not implemented.

Or just make it return an empty list by default.

Thanks Sven! To follow up on this: it seems OpenAI Gym-style rendering is not possible with the VectorEnv, since it doesn’t implement the gym.Env interface and rendering in RLlib uses the gym monitor. Do you have a suggestion for that? I could see it working by inheriting from both gym.Env and VectorEnv and implementing all relevant functions (where the gym.Env functions use the VectorEnv at a certain index), but that would probably mess up parts of RLlib that explicitly check the environment type?

Hey @janblumenkamp . There should be a try_render_at method for vectorized envs that RLlib will call if you set “render_env”=True in your config. It’d be enough to return a numpy RGB image from that method, I think.
Take a look at ray/rllib/examples/env_rendering_and_recording.py.
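Roughly like this (a sketch only: the class and its drawing logic are invented, and the exact try_render_at signature may differ between Ray versions; the key idea is returning an HxWx3 uint8 numpy array for one sub-env):

```python
import numpy as np

# Hypothetical custom vector env overriding try_render_at. RLlib calls
# this when "render_env": True is set in the config; returning a uint8
# RGB frame lets RLlib display/record it, while returning None signals
# that the env rendered into its own window.
class RenderableBatchedEnv:
    def __init__(self, num_envs=4, size=64):
        self.num_envs = num_envs
        self.size = size
        self.pos = np.zeros(num_envs, dtype=np.float32)  # in [-1, 1]

    def try_render_at(self, index=0):
        # Paint a black frame with a white column marking the
        # (clipped) position of the agent in sub-env `index`.
        frame = np.zeros((self.size, self.size, 3), dtype=np.uint8)
        col = int(np.clip((self.pos[index] + 1.0) / 2.0, 0.0, 1.0)
                  * (self.size - 1))
        frame[:, col, :] = 255
        return frame
```

Since the state for all sub-envs already lives in one array, rendering sub-env `index` is just slicing into that array and rasterizing it.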

Yes, having the env be both a VectorEnv and gym.Env may mess up things. :confused:

The current API allows you to:

  • either return an image from this method (try_render_at).
  • or handle rendering (in your own window) yourself → returning None from try_render_at.

In the first case, RLlib will display the returned image in a makeshift window.

Would this help you?