Hi, I’m using DQN for my custom multi-agent env. I can test my env with both trainer and tune, and both work without any error. After just a few training iterations, I would like to see the agent’s performance with compute_action, but I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'float'
I ran my code in the debugger and found that the problem comes from torch_policy.py, line 276:
return self._compute_action_helper(input_dict, state_batches, seq_lens, explore, timestep)
I’m using torch vision as the policy. At line 276, input_dict contains the obs, and PyTorch expects to receive it as a Tensor. But for some reason RLlib/Ray stores the obs as a NumPy array instead of a tensor, and as a result this error is raised.
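To make the mismatch concrete, here is a minimal sketch outside of RLlib. It assumes the failing call is something like obs.float() on the observation, which is what the error message suggests. The helper name to_float_tensor and the observation shape are made up for illustration and are not part of RLlib.

```python
import numpy as np
import torch


def to_float_tensor(obs):
    """Return obs as a float32 torch.Tensor (hypothetical helper, not an RLlib API)."""
    if isinstance(obs, np.ndarray):
        # Wrap the array as a tensor, then cast to float32.
        return torch.from_numpy(obs).float()
    if isinstance(obs, torch.Tensor):
        return obs.float()
    # Fallback for lists or scalars.
    return torch.as_tensor(obs, dtype=torch.float32)


# Made-up image-like observation, standing in for input_dict["obs"].
obs = np.zeros((4, 84, 84), dtype=np.uint8)

# Calling a tensor method on a NumPy array reproduces the reported error.
try:
    obs.float()
except AttributeError as err:
    message = str(err)

# Converting first gives PyTorch the type it expects.
converted = to_float_tensor(obs)
```

If the observation is converted like this before it reaches the model, the .float() call should succeed. Whether that conversion belongs in my code or is something RLlib is supposed to do is part of what I’m unsure about.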
Does anyone know what is happening in the background?
Thanks!