How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
When I set `_disable_preprocessor_api = True` in my `PPOConfig`, I get mixed input to the `forward` method of my model (a subclass of `TorchModelV2`).
I have been trying to implement a centralized critic by adding extra observations to the agent’s own observation (using the `observation_fn` of the multi-agent config). While debugging, I observed that for the first 10 or so calls to `forward`, `input_dict` contains the correct sample input: `input_dict['obs']` is a dict with three keys, `own_obs`, `opponent_obs`, and `opponent_action`, so I can access, for example, the opponent’s action as `input_dict['obs']['opponent_action']`.
From around the 11th call onward, `input_dict['obs']` is no longer a dict with these three entries but a single numpy array, probably the agent’s own obs. So my `forward` method is receiving mixed input, and I am not sure how to handle it, because I need all the opponent data for the forward pass of my model.
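To make the two cases easier to tell apart while debugging, here is a minimal sketch of a guard that could be called at the top of `forward`. It is plain Python and does not use any RLlib API. The helper name `split_obs` and the `EXPECTED_KEYS` constant are hypothetical, and `obs` stands in for `input_dict['obs']`. It only detects the flat case and reports it; it does not fix the underlying cause.

```python
from collections.abc import Mapping

# Keys that are present on the early, correct calls (taken from the post).
EXPECTED_KEYS = ("own_obs", "opponent_obs", "opponent_action")


def split_obs(obs):
    """Return (own_obs, opponent_obs, opponent_action) from a dict observation.

    `obs` stands in for input_dict['obs']. Raises a descriptive error when
    the observation arrives as something other than the expected dict.
    """
    if isinstance(obs, Mapping):
        missing = [k for k in EXPECTED_KEYS if k not in obs]
        if missing:
            raise KeyError(f"obs dict is missing keys: {missing}")
        return tuple(obs[k] for k in EXPECTED_KEYS)

    # Flat input: the case seen from roughly the 11th call onward.
    shape = getattr(obs, "shape", None)  # numpy arrays and tensors expose .shape
    raise TypeError(
        f"expected a dict observation, got {type(obs).__name__} with shape {shape}"
    )
```

Calling `own, opp_obs, opp_act = split_obs(input_dict['obs'])` as the first line of `forward` would raise on the first flat input and show its type and shape, which helps narrow down which call site is passing the unexpected array.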
Please help me understand what I am doing wrong.