Use gym.wrappers for Training

Hi there,

I’m trying to train on and solve the CarRacing-v0 problem with PPOTrainer:

agent = PPOTrainer(config=config,env="CarRacing-v0")

It doesn’t work out of the box because I get this error (I understand this is somehow related to the observation space being too big or something like that). The same code works with other envs such as LunarLander-v2:

ValueError: No default configuration for obs shape [96, 96, 3], you must specify conv_filters manually as a model option. Default configurations are only available for inputs of shape [42, 42, K] and [84, 84, K]. You may alternatively want to use a custom model or preprocessor.

As suggested, I tried to use a preprocessor.

I stumbled upon this piece of documentation, which tells me that I should use gym.wrappers to do that.

But after searching for a while, I can’t figure out how, since the PPOTrainer constructor expects a string (which corresponds to one of the OpenAI Gym environments), not a wrapper.

I even tried to register an environment that uses the wrapper, but that doesn’t work either (I’m aware that this code makes no sense, but I tried anyway):

register_env("my_env",lambda config: PixelObservationWrapper(env=env))

In that case I get this error:

ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=32129, ip=
  File "/home/benjamin/.local/lib/python3.9/site-packages/ray/rllib/evaluation/", line 456, in __init__
    self.env = env_creator(copy.deepcopy(self.env_context))
  File "/home/benjamin/git/advancedparallelsystemproject2/", line 24, in <lambda>
    register_env("my_env",lambda config: PixelObservationWrapper(env=env))
  File "/home/benjamin/.local/lib/python3.9/site-packages/gym/wrappers/", line 88, in __init__
    if np.issubdtype(pixels.dtype, np.integer):
AttributeError: 'NoneType' object has no attribute 'dtype'

I cannot find any pointer on how to do this anywhere.
Really stuck here, would appreciate any help.

Alternatively, I could try a convolution filter or a custom model, but I have no idea how to do that either. (And to be honest, I’m not sure what these mean anyway.)

Sorry if I sound like I misunderstand things, that’s because I’m new to this.

Couple of suggestions:

  1. You should learn how to do convolutions in custom models because that will give you the best model for training a CarRacing simulation.
  2. I’m not an expert on it, but it doesn’t seem like PixelObservationWrapper will solve this problem anyway.
  3. The registration is not pretty, but it should work. Looks like the problem may actually be in the PixelObservationWrapper. Can you test using this wrapper outside of rllib and ensure that it works?

Hey @olimar718 ,
the main problem here is that your observation space has a shape (96, 96, 3), for which RLlib does not have a default model (you don’t specify a custom model, so RLlib needs to come up with one itself, but can’t find a suitable set of conv filters for that particular shape).

@rusu24edward gave some good pointers here:

  • You can specify a custom model that handles the observation shape well (via matching Conv2D layer(s)).
  • Or you can simply configure a working choice of:
        conv_filters: [[num-filters, kernel, stride], [], []]  # <- find some good filter choices here that would work
  • You can specify any custom Env (e.g. your wrapped one) in your config, either via the tune.register_env([some env name], lambda env_config: [return some env object]) tool or by directly giving the full classpath of your custom Env class, e.g.:
    env: ray.rllib.examples.env.random_env.RandomEnv,

    # w/ tune.register_env:
    env: "your registered env name here"
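
The register_env route could look roughly like the sketch below. It assumes gym's ResizeObservation wrapper (which needs opencv-python installed) to shrink the frames to 84x84, one of the shapes RLlib has a default model for. The env name car_racing_84 and the helper make_resized_car_racing are made up for illustration, and the imports follow the Ray 1.x layout used in this thread.

```python
ENV_NAME = "car_racing_84"  # hypothetical name, any unique string works


def make_resized_car_racing(env_config):
    """Build a fresh CarRacing-v0 env whose frames are resized to 84x84x3."""
    # Imported lazily so this module loads even where gym is not installed.
    import gym
    from gym.wrappers import ResizeObservation

    env = gym.make("CarRacing-v0")
    return ResizeObservation(env, 84)


# Minimal config; every other setting is left at its default.
config = {"num_workers": 1}


def main():
    import ray
    from ray.rllib.agents.ppo import PPOTrainer
    from ray.tune.registry import register_env

    ray.init()
    # The creator is called with the env config and must return a new env.
    register_env(ENV_NAME, make_resized_car_racing)
    agent = PPOTrainer(config=config, env=ENV_NAME)
    result = agent.train()
    print(result["episode_reward_mean"])
```

Calling main() runs one training iteration. Note that the creator builds the env inside the function instead of wrapping an env object created elsewhere, so every call gets its own instance.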

I’ll add the 96, 96, 3 format to our defaults so that RLlib should work out of the box with that env in the future.

Thanks for raising this issue @olimar718 and thanks for your help @rusu24edward !

A quick fix would be to use this config:

    conv_filters: [[16, [8, 8], 4], [32, [4, 4], 2], [256, [11, 11], 2]],

This should work with a 96x96x3 obs space.
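
To see why these filters fit, here is a small sketch that traces the feature-map size through the layers. It assumes that every layer except the last uses "same" padding and the last one uses "valid" padding; that is my understanding of RLlib's default vision network, so treat it as an assumption. The helper conv_output_shapes is hypothetical and not part of RLlib.

```python
import math


def conv_output_shapes(input_hw, conv_filters):
    """Return (height, width, channels) after each conv layer.

    Assumption: all layers but the last use "same" padding,
    the last layer uses "valid" padding.
    """
    h, w = input_hw
    shapes = []
    last = len(conv_filters) - 1
    for i, (channels, kernel, stride) in enumerate(conv_filters):
        kh, kw = kernel
        if i < last:
            # "same" padding: output size depends only on the stride.
            h, w = math.ceil(h / stride), math.ceil(w / stride)
        else:
            # "valid" padding: the kernel must fit inside the input.
            h, w = (h - kh) // stride + 1, (w - kw) // stride + 1
        shapes.append((h, w, channels))
    return shapes


filters = [[16, [8, 8], 4], [32, [4, 4], 2], [256, [11, 11], 2]]
print(conv_output_shapes((96, 96), filters))
# [(24, 24, 16), (12, 12, 32), (1, 1, 256)]
```

Under that assumption the 96x96 input shrinks to 24x24, then 12x12, and the last layer collapses it to a single 1x1x256 feature vector, which is what the default model expects before its dense layers.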

@sven1977 ,

is there a specific reason (other than saving parameters) why you did not use a [256, [12, 12], 1] filter at the end? I guess the [11, 11] kernel with stride 2 leaves out the last row and last column of the feature maps from the previous layer.