[RLlib] Variable-length Observation Spaces without padding

Here it is stated that

RLlib internally will insert preprocessors to insert padding for repeated elements.

I have a dynamic observation space that is quite heterogeneous, and I don’t want the maximum-sized observation to dictate the size of all observations, because that would be quite inefficient. Also, I have my own model that handles the variable-sized inputs.

Is it possible to disable this padding (or the padding preprocessor)?


@alexanderb14, is your custom model a TF Keras model?
I am interested in how you handle variable-sized inputs in your model.

BTW: I think it would be helpful if you labeled your question “rllib”.

Thanks for the answer. I just added the label.
My model is a custom PyTorch model with a Graph Neural Network architecture that can handle variable-sized graphs.

Ah, interesting… my model is built on TF Keras, and as far as I know (from last year), TF has ragged tensors to handle variable-sized data, but ragged tensors were not supported by Keras models yet. I don’t know if this still holds, but it also doesn’t matter for your PyTorch model :sweat_smile:
I will also try to use the repeated spaces with their automatic padding, and I hope learning and efficiency won’t suffer from that :pray:

Maybe it is helpful for you to take a look at the documentation for rllib.models.RepeatedValues and rllib.models.Preprocessor.
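
For example, declaring such a space could look roughly like this (just a sketch; in the 1.x releases, Repeated lives under ray.rllib.utils.spaces.repeated):

```python
import gym
import numpy as np
from ray.rllib.utils.spaces.repeated import Repeated

# A variable-length list of up to 10 items, each a 3-dim feature vector.
# max_len is mandatory -- it caps the list and is also RLlib's padding target.
observation_space = Repeated(
    gym.spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32),
    max_len=10,
)

# Sampling yields genuinely variable-length Python lists:
print(len(observation_space.sample()))  # length varies from sample to sample
```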


What @klausk55 said! :slight_smile:

Also, you can take a look at this example script here, which shows how to use the Repeated space as an environment’s obs space and as inputs into a simple model.

As far as I remember, the “obs” goes into the model in this example as Repeated w/o padding.

Hello @sven1977, thanks a lot for your response!

However, padding is applied in the example you mentioned (my code is actually based on this example).
For easy analysis, I printed the shape of the “items” (which is a Repeated space) that the model gets, with:
print(input_dict["obs"].values["items"].values.shape)

It is always the same shape, although the environment passes different numbers of items. So the padding is applied in this example.
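
The true item counts are still available, though: the RepeatedValues struct carries a lengths field next to the padded values tensor, so the padding can at least be masked out inside the model. A small sketch (the helper below is illustrative, not from the example script):

```python
import torch

def mask_padding(padded: torch.Tensor, lengths: torch.Tensor) -> torch.Tensor:
    """Zero out the padded slots of a [B, max_len, feat] items tensor."""
    max_len = padded.shape[1]
    # mask[b, i] is 1.0 iff item i is a real (non-padded) item of row b.
    mask = (torch.arange(max_len, device=padded.device)[None, :]
            < lengths[:, None]).float()
    return padded * mask.unsqueeze(-1)

# Inside the model's forward(), for a top-level Repeated obs space:
#   rep = input_dict["obs"]                 # a RepeatedValues struct
#   real = mask_padding(rep.values, rep.lengths)
# (Nested spaces like the "items" above add one RepeatedValues per level.)
```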

I tried to pass a custom hierarchy of preprocessors with a custom one for RepeatedValues (one that doesn’t pad); however, it always breaks things in RLlib.
Would this be the right approach to remove the padding?
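
The kind of preprocessor I experimented with looked roughly like this (a sketch; register_custom_preprocessor is the documented hook in the 1.x releases, but the class name and identity logic here are just for illustration, and as said, it breaks RLlib’s downstream batching):

```python
from ray.rllib.models import ModelCatalog
from ray.rllib.models.preprocessors import Preprocessor

class NoPaddingPreprocessor(Preprocessor):
    """Identity preprocessor: hand the raw observation through unchanged."""

    def _init_shape(self, obs_space, options):
        # A Repeated space has no fixed shape -- already a problem for
        # everything downstream that assumes one.
        return obs_space.shape

    def transform(self, observation):
        return observation  # no flattening, no padding

ModelCatalog.register_custom_preprocessor("no_padding", NoPaddingPreprocessor)

# Then in the trainer config:
# config["model"]["custom_preprocessor"] = "no_padding"
```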

Internally, RLlib seems to be quite agnostic of the observations’ structure and applies transformations to them based purely on their shape.
Is having repeated elements that don’t follow this transformation pipeline incompatible with RLlib’s architecture?

Hi, I have added rudimentary support for this for my use case. The solution is very hacky and will break other things in RLlib; however, I want to share it in case anyone runs into a similar issue. It is tested with the DQN and PPO algorithms.

The version of RLlib that this patch applies to is 1.1.0.