Receiving mixed input to my model's forward method when _disable_preprocessor_api=True (Ray version 2.3 to 2.5)

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

When I set _disable_preprocessor_api = True in my PPOConfig, I get mixed input to the forward method of my model (a subclass of TorchModelV2).

I have been trying to implement a centralized critic by adding extra observations to each agent’s own observation (using observation_fn in the multi-agent config). While debugging, I observed that for roughly the first 10 calls to forward, input_dict contains the expected sample input: input_dict['obs'] is a dict with three keys - own_obs, opponent_obs and opponent_action - so I can access, for example, the opponent’s action as input_dict['obs']['opponent_action']. From around the 11th call onward, input_dict['obs'] is no longer a dict with those three keys but a single numpy array, probably the agent’s own observation. So my forward method is receiving mixed input, and I am not sure how to handle this, as I need all the opponent data for the forward pass of my model.
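For anyone debugging the same symptom, here is a minimal sketch of a guard that could be called at the top of forward to fail loudly when the observation is not the expected dict. The helper name split_central_critic_obs and the EXPECTED_KEYS constant are hypothetical and not part of RLlib; only the three key names come from the description above.

```python
from collections.abc import Mapping

# Key names taken from the post; the helper itself is hypothetical.
EXPECTED_KEYS = ("own_obs", "opponent_obs", "opponent_action")


def split_central_critic_obs(obs):
    """Return (own_obs, opponent_obs, opponent_action) or raise a clear error."""
    if not isinstance(obs, Mapping):
        # This is the "mixed input" case: a bare array instead of a dict.
        raise TypeError(
            f"expected a dict-like obs with keys {EXPECTED_KEYS}, "
            f"got {type(obs).__name__}"
        )
    missing = [key for key in EXPECTED_KEYS if key not in obs]
    if missing:
        raise KeyError(f"obs is missing keys: {missing}")
    return tuple(obs[key] for key in EXPECTED_KEYS)
```

Inside forward this would be used as own_obs, opponent_obs, opponent_action = split_central_critic_obs(input_dict['obs']), so the first call with a bare array raises immediately instead of failing later on a missing key.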

Please help me figure out what I am doing wrong.

Hi @omsrisagar ,

Can you provide a reproducible example?

Thank you @Lars_Simon_Zehnder .

I fixed this issue by passing enable_connectors=False to the rollouts config.
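As a rough sketch, the combination of settings discussed in this thread could look like the following when written as a legacy dict-style config. The key names are an assumption based on how these options are usually spelled in Ray 2.3 to 2.5 dict configs, and observation_fn here is only a placeholder, so please check them against your Ray version.

```python
def observation_fn(agent_obs, **kwargs):
    """Placeholder for the function that builds the aggregate observation."""
    return agent_obs


# Assumed legacy dict-style keys; verify against your Ray version.
config = {
    "framework": "torch",
    # Setting from the original question.
    "_disable_preprocessor_api": True,
    # The fix described above: turn connectors off for rollouts.
    "enable_connectors": False,
    "multiagent": {
        "observation_fn": observation_fn,
    },
}
```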

To provide more info on the above bug: I was trying to recreate the centralized critic example provided at https://github.com/ray-project/ray/blob/master/rllib/examples/centralized_critic_2.py. But as I was coming from Ray 2.1.0, I did not use the connectors feature.
(Note that there is also a TODO comment in that file: # TODO(avnishn) make a new example compatible w connectors.)
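For context, an observation function of this kind could be sketched as below. This is modeled on the idea of that example rather than copied from it, and it assumes exactly two agents with IDs 0 and 1 and a placeholder opponent action of 0 that something else (for example a callback) would fill in later.

```python
def central_critic_observer(agent_obs, **kwargs):
    """Build the aggregate observation for each of two agents.

    Assumes agent_obs maps agent IDs 0 and 1 to their own observations.
    """
    own_0, own_1 = agent_obs[0], agent_obs[1]
    return {
        0: {
            "own_obs": own_0,
            "opponent_obs": own_1,
            "opponent_action": 0,  # placeholder, to be filled in later
        },
        1: {
            "own_obs": own_1,
            "opponent_obs": own_0,
            "opponent_action": 0,  # placeholder, to be filled in later
        },
    }
```

With this shape, each agent's model sees a dict with the keys own_obs, opponent_obs and opponent_action, which is the input the forward method expects.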

The first 10 or so calls correspond to creating the trainer instance, and during that time the correct input was being passed. But once I ran trainer.train(), my forward method received only the agent’s own observation instead of the aggregate observation generated by multi_agent_config["observation_fn"].

Once I disabled connectors, observation_fn was called properly, giving the expected input to my forward method.
