Unable to use Dict observation space

My goal is to train a NN for a card game with self-play; this is my first experience in this field (I am using ray-2.0.0.dev0). In my PettingZoo environment I have:

        self.action_spaces = {agent: Dict({
            GameState.BIDDING: Discrete(BID_ACTIONS),  
            GameState.CHOOSE_TRUMP: Discrete(CHOOSE_TRUMP_ACTIONS),
            GameState.TRICK: Discrete(TRICK_ACTIONS)
        }) for agent in self.agents}
        self.observation_spaces = {agent: Dict({
            'observations': Dict({
                'gamestate'    : Discrete(len(GameState)),
                'player_hand'  : Box(low=0, high=1, shape=(TRICK_ACTIONS,), dtype=bool),
                # [...] other state I keep temporarily commented
            }),
            'action_mask': Dict({
                GameState.BIDDING:      Box(low=0, high=1, shape=(BID_ACTIONS,), dtype=bool),
                GameState.CHOOSE_TRUMP: Box(low=0, high=1, shape=(CHOOSE_TRUMP_ACTIONS,), dtype=bool),
                GameState.TRICK:        Box(low=0, high=1, shape=(TRICK_ACTIONS,), dtype=bool)
            })
        }) for agent in self.agents}

To be able to apply action masks, as suggested in the documentation, I defined a custom model:

class ParametricActionsModel(TFModelV2):
    def __init__(self,
            obs_space, action_space, num_outputs, model_config, name, **kw):
        self.prep = get_preprocessor(obs_space.original_space.spaces['observations'])
        orig_obs_space = self.prep(obs_space.original_space.spaces['observations'])
        self.action_embed_model = FullyConnectedNetwork(
            name + "_action_embedding"

    def forward(self, input_dict, state, seq_lens):
        # Extract the available actions tensor from the observation.
        action_mask = input_dict["obs"]["action_mask"]

        # Compute the predicted action embedding
        orig_obs = self.prep.transform(input_dict["obs"]["observations"])
        action_logits, _ = self.action_embed_model({
            "obs": orig_obs
        })

        # Mask out invalid actions (use tf.float32.min for stability)
        inf_mask = tf.maximum(tf.math.log(action_mask), tf.float32.min)
        return action_logits + inf_mask, state

    def value_function(self):
        return self.action_embed_model.value_function()

However, I get the error below:

  File "E:/Vmware shared folder/python/BriscolaChiamata/train.py", line 44, in forward
    orig_obs = self.prep.transform(input_dict["obs"]["observations"])
TypeError: transform() missing 1 required positional argument: 'observation' at time: 1.64718e+09

The point is: the actual observation I want to feed the NN with is input_dict["obs"]["observations"], but I don’t know how to actually pass it to self.action_embed_model(). I am not even sure that ParametricActionsModel.__init__() does the right thing.

By the way, a previous temporary version of this code, where I had:

self.observation_spaces = {agent: Dict({
            'observations': Box( #...
            'action_mask': # as above

and the custom ParametricActionsModel (without the preprocessor and the transform() call), did not cause errors. However, using a Dict space helps me keep the code clearer.

Can someone please help me? I did not find the existing examples helpful enough for my (very low) level of expertise.
Thanks in advance

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

Hi @Loneknight73 ,

I don’t understand this error at the moment, since your preprocessor should only need one argument: the observation.

But I noticed you are using get_preprocessor() instead of get_preprocessor_for_space().
Does that mess up your usage of the preprocessor?
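
One possible explanation, offered as a sketch rather than a confirmed diagnosis: as far as I know, get_preprocessor() returns a preprocessor class rather than an instance, while ModelCatalog.get_preprocessor_for_space() returns an instance. If that is the case here, self.prep in the model is a class, so self.prep.transform(x) binds x to self and leaves the observation argument unfilled, which would match the TypeError in the traceback. The snippet below reproduces that effect in plain Python; FakePreprocessor and fake_get_preprocessor are hypothetical stand-ins, not RLlib code.

```python
class FakePreprocessor:
    """Hypothetical stand-in for a preprocessor (not the real RLlib class)."""

    def __init__(self, obs_space):
        self.obs_space = obs_space

    def transform(self, observation):
        # A real preprocessor would flatten the observation; here we just copy it.
        return list(observation)


def fake_get_preprocessor(space):
    """Hypothetical stand-in that returns the CLASS, not an instance."""
    return FakePreprocessor


prep_cls = fake_get_preprocessor("some space")

# Calling transform() on the class: the list is bound to `self`,
# so `observation` is reported as missing.
message = None
try:
    prep_cls.transform([1, 0, 1])
except TypeError as err:
    message = str(err)
print(message)

# Calling transform() on an instance works as expected.
prep = prep_cls("some space")
result = prep.transform([1, 0, 1])
print(result)
```

Calling transform() on an instance avoids this particular TypeError. Whether transform() can then handle the batched tensors that arrive inside forward() is a separate question that this sketch does not address.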

Also: Have a look at this!

Hi @arturn ,
thanks for your answer. I could not try your suggestion, but I remember orig_obs_space having the right shape.

Yes, I saw it. It seems to me that custom preprocessors are deprecated, but built-in ones are not. It’s not clear whether there are other automated ways to convert a dictionary into a suitable NN input.
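
To make concrete what such a conversion amounts to, here is a minimal plain-Python sketch of flattening a nested dictionary observation into a single vector: keys are visited in sorted order, discrete values are one-hot encoded, and boolean arrays are cast to floats. It only illustrates the idea and is not RLlib's implementation; flatten_observation, one_hot, and the size of 3 used for 'gamestate' are assumptions made for the example.

```python
def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def flatten_observation(obs, discrete_sizes):
    """Flatten a (possibly nested) dict observation into a flat list of floats.

    Keys are visited in sorted order. Keys listed in `discrete_sizes`
    hold an integer that is one-hot encoded; other leaves are sequences
    whose elements are cast to float.
    """
    flat = []
    for key in sorted(obs):
        value = obs[key]
        if isinstance(value, dict):
            flat.extend(flatten_observation(value, discrete_sizes))
        elif key in discrete_sizes:
            flat.extend(one_hot(value, discrete_sizes[key]))
        else:
            flat.extend(float(v) for v in value)
    return flat


# Hypothetical observation: game state 2 out of 3, and a 4-card hand mask.
obs = {"gamestate": 2, "player_hand": [True, False, True, True]}
flat = flatten_observation(obs, discrete_sizes={"gamestate": 3})
print(flat)
```

The resulting vector has one entry per discrete state plus one per card, so its length is fixed by the space definition and can serve as the input size of a fully connected network.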