Integrate Custom Vectorized Environment with RLlib

Hello,

I want to implement a custom vectorized environment. I’m aware that RLlib handles the vectorization automatically when you set num_envs_per_worker > 1 by creating multiple environment copies, but for my use case, I need to handle the vectorization myself.
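
For context, here is roughly the standard setup I'm referring to (sketched with the RLlib 2.x AlgorithmConfig API; PPO and the worker counts are just placeholders):

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment(env=MyEnv)
    # RLlib creates 8 copies of the environment on each rollout worker:
    .rollouts(num_rollout_workers=4, num_envs_per_worker=8)
)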

This is because my environment is responsible for running an executable program, and the parallelization is handled inside that program. I don’t want RLlib to create multiple copies of this environment, since that would end up running many executable instances on a single worker.

I tried implementing my own environment class that extends VectorEnv, but unfortunately this is no longer supported (see Environments with VectorEnv not able to run in parallel):

TypeError: The environment must inherit from the gymnasium.Env class

I want to start many workers, each of which runs a single executable that is responsible for the vectorization. The environment’s step function will then return a batch of observations, rewards, etc.
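
In config terms, what I’m after is something like this (again a hedged sketch with the same API as above; VecEnvOrchestrator is my class from below, and the numbers are placeholders):

config = (
    PPOConfig()
    .environment(env=VecEnvOrchestrator, env_config={"num_envs": 16})
    # One wrapper instance per worker; the executable inside does the fan-out:
    .rollouts(num_rollout_workers=4, num_envs_per_worker=1)
)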

The environment class that I provide to RLlib looks a little bit like this:

from ray.rllib.env.vector_env import VectorEnv

# `lib` and `MyEnv` come from my own wrapper library around the executable.
class VecEnvOrchestrator(VectorEnv):
    def __init__(self, env_config: dict):
        num_envs = env_config.get("num_envs", 1)
        observation_space = env_config.get("observation_space", None)
        action_space = env_config.get("action_space", None)

        # Start the executable; it handles the parallelization internally.
        instance = lib.start("my_environment.exe")

        # Create the vectorized environment backed by that executable.
        self.vec_env = lib.VecEnv(MyEnv, instance, num_envs, observation_space, action_space)

        super().__init__(observation_space=observation_space, action_space=action_space, num_envs=num_envs)

    def vector_reset(self, *, seeds=None, options=None):
        # RLlib passes None when no per-env seeds/options are given.
        seeds = seeds if seeds is not None else [None] * self.num_envs
        options = options if options is not None else [None] * self.num_envs
        results = [
            env.reset(seed=seed, options=opts)
            for env, seed, opts in zip(self.vec_env.envs, seeds, options)
        ]
        # VectorEnv expects (list of observations, list of infos), not a list of tuples.
        obs, infos = zip(*results)
        return list(obs), list(infos)

    def reset_at(self, index=None, *, seed=None, options=None):
        env = self.vec_env.envs[index]
        return env.reset(seed=seed, options=options)

    def vector_step(self, actions):
        return self.vec_env.step(actions)

Is it possible to get this working?
What’s the best way to do this?

Thanks

Hello! This is a good question, I’ll try to help out the best I can :sweat_smile:

Can you try making your environment class extend gymnasium.Env instead of VectorEnv, i.e. class VecEnvOrchestrator(gym.Env) instead of class VecEnvOrchestrator(VectorEnv)? This class would act as a translator between RLlib and your inner vectorized environments. Then set num_envs_per_worker=1 so that RLlib only creates one instance of your wrapper per worker, preventing multiple executables from running on the same machine.
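
Something like this rough, untested sketch, reusing the lib calls from your snippet (I’m assuming your lib.VecEnv also exposes a batched reset(); adjust to whatever your library actually provides):

import gymnasium as gym

class VecEnvOrchestrator(gym.Env):
    def __init__(self, env_config: dict):
        num_envs = env_config.get("num_envs", 1)
        self.observation_space = env_config.get("observation_space", None)
        self.action_space = env_config.get("action_space", None)

        # Start the executable once per worker.
        instance = lib.start("my_environment.exe")
        self.vec_env = lib.VecEnv(
            MyEnv, instance, num_envs, self.observation_space, self.action_space
        )

    def reset(self, *, seed=None, options=None):
        return self.vec_env.reset(seed=seed, options=options)

    def step(self, action):
        return self.vec_env.step(action)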

Hopefully someone who has more experience with this can chime in and help :crossed_fingers:

Hi Christina,

Thank you for your quick response. I’ve tried making my VecEnvOrchestrator class extend gymnasium.Env, but I ran into quite a few problems. Specifically, I need the step function to return a batch of observations, rewards, terminated, truncated, and info values, one entry per sub-environment, which I haven’t been able to achieve with gymnasium.Env. I also need it to take a batch of actions, since I have to apply an action to each sub-environment handled by my inner vectorization. Hopefully this makes sense from the code snippet I provided above.
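
Concretely, one batched step needs to hand back values shaped like this (dummy numpy placeholders just to show the shapes; the real values come from the executable):

import numpy as np

num_envs, obs_dim = 16, 4  # placeholder sizes

obs = np.zeros((num_envs, obs_dim), dtype=np.float32)  # one row per sub-env
rewards = np.zeros(num_envs, dtype=np.float32)         # one float per sub-env
terminateds = np.zeros(num_envs, dtype=bool)
truncateds = np.zeros(num_envs, dtype=bool)
infos = [{} for _ in range(num_envs)]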

I’ve encountered errors like “2D Box spaces are not supported” (when I tried to batch the observations into a single space) and that the reward is not a float (when I tried to return a batch of rewards) while using gymnasium.Env.
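
For example, I tried stacking the per-environment observations into a single batched space like this (dimensions are placeholders), which is what triggers the first error:

import gymnasium as gym
import numpy as np

num_envs, obs_dim = 16, 4  # placeholder sizes

# One row per sub-environment -> a 2D Box, which RLlib rejects:
observation_space = gym.spaces.Box(
    low=-np.inf, high=np.inf, shape=(num_envs, obs_dim), dtype=np.float32
)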

Thanks