Hi, I’m using DQN for my custom multi-agent env. I can run my env with both the trainer and tune, and both work without any error. After just a few training iterations, I would like to check the agent’s performance with compute_action, but I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'float'
I ran my code in the debugger and found that the problem comes from torch_policy.py, line 276:
return self._compute_action_helper(input_dict, state_batches, seq_lens, explore, timestep)
I’m using torch vision as the policy. At line 276, input_dict contains the obs, and PyTorch expects to receive it as a Tensor. But for some reason RLlib/Ray stores the obs in a NumPy array instead of a Tensor, so the call to .float() on it raises this error.
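To illustrate what I think is going on, here is a minimal sketch outside RLlib that produces the same message and shows the conversion I would expect to happen before the model is called. The observation shape and the helper name to_float_tensor are made up for illustration and are not part of RLlib.

```python
import numpy as np
import torch


def to_float_tensor(obs):
    """Hypothetical helper (not an RLlib function): return obs as a float32 tensor."""
    if isinstance(obs, np.ndarray):
        # torch.from_numpy shares memory with the array and keeps its dtype
        obs = torch.from_numpy(obs)
    return obs.float()


# Assumed image-like observation; the real env's shape and dtype may differ.
obs = np.zeros((1, 84, 84, 3), dtype=np.uint8)

# Calling .float() directly on the NumPy array fails like in the traceback.
error_message = None
try:
    obs.float()
except AttributeError as err:
    error_message = str(err)

# Converting first gives a tensor that supports .float().
obs_tensor = to_float_tensor(obs)
```

This only reproduces the symptom. It does not explain why compute_action hands the raw array to the policy in my setup.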
Does anyone know what is happening in the background?
Thanks!