How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
When I set _disable_preprocessor_api = True in my PPOConfig, I get mixed input to the forward method of my model (a subclass of
I have been trying to implement a central critic by adding extra observations to the agent's own observation (using observation_fn of the multi-agent config). While debugging, I observed that for roughly the first 10 calls to this forward method, input_dict contains the correct sample input. The 'obs' item of input_dict has three additional keys, including opponent_action, so I can access, for example, the opponent's action as input_dict['obs']['opponent_action'].

From around the 11th call onward, input_dict['obs'] is no longer a dict with those three entries but a single numpy array, probably the agent's own obs. So my forward method is receiving mixed input, and I am not sure how to handle this, since I need all the opponent data for the forward pass of my model.
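The two input shapes described above can be illustrated with a minimal sketch. It uses plain NumPy rather than RLlib, so it does not reproduce the issue itself. The helper name describe_obs and the sample shapes and values are hypothetical. Only the key opponent_action and the input_dict['obs'] access pattern come from the description above.

```python
import numpy as np


def describe_obs(input_dict):
    """Report which of the two observed input forms a forward call received."""
    obs = input_dict["obs"]
    if isinstance(obs, dict):
        # Dict form (seen in the early calls): entries are reachable by key.
        return "dict", sorted(obs.keys())
    # Array form (seen in the later calls): a single array, no named entries.
    return "array", tuple(np.asarray(obs).shape)


# Hypothetical sample inputs; shapes and values are made up for illustration.
dict_call = {"obs": {"opponent_action": np.zeros((4, 2))}}
array_call = {"obs": np.zeros((4, 6))}

print(describe_obs(dict_call))
print(describe_obs(array_call))
```

Logging the result of a check like this on every forward call would show exactly when the input switches from the dict form to the array form.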
Please help me understand what I am doing wrong.