Torch tensor observation raises ValueError during training

  • High: It blocks me from completing my task.

Hi,

I am hitting an error while training a model whose observations are Torch tensors. It comes from agent_collector.py: the _to_float_np_array function raises a ValueError when it receives Torch tensors.

This prevents me from training with Torch tensors as observations. I have looked for workarounds and fixes without success. Has anyone hit the same problem, or found a way to handle Torch tensors in the agent_collector.py module? Any insight or suggestion on how to resolve this would be greatly appreciated.

Thanks in advance for any help or guidance you can provide!

This is part of my error log:

sample_batch = agent_collector.build_for_inference()
(DQN pid=8423) File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/collectors/agent_collector.py", line 366, in build_for_inference
(DQN pid=8423) self._cache_in_np(np_data, data_col)
(DQN pid=8423) File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/collectors/agent_collector.py", line 613, in _cache_in_np
(DQN pid=8423) cache_dict[key] = [_to_float_np_array(d) for d in self.buffers[key]]
(DQN pid=8423) File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/collectors/agent_collector.py", line 613, in <listcomp>
(DQN pid=8423) cache_dict[key] = [_to_float_np_array(d) for d in self.buffers[key]]
(DQN pid=8423) File "/usr/local/lib/python3.10/dist-packages/ray/rllib/evaluation/collectors/agent_collector.py", line 33, in _to_float_np_array
(DQN pid=8423) raise ValueError
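
For reference, here is a minimal sketch of one possible workaround, assuming the collector only accepts NumPy-compatible observations (which is what the ValueError above suggests): convert the tensor inside the environment before returning it. The helper name to_numpy_obs is hypothetical and not part of RLlib. It duck-types on the detach(), cpu() and numpy() methods that torch.Tensor provides, so the sketch itself does not need to import torch.

```python
def to_numpy_obs(obs):
    """Return a NumPy version of `obs` if it looks like a torch.Tensor.

    Hypothetical helper, not an RLlib API. Anything that does not expose
    detach(), cpu() and numpy() is returned unchanged.
    """
    tensor_methods = ("detach", "cpu", "numpy")
    if all(hasattr(obs, name) for name in tensor_methods):
        # detach() drops the autograd graph, cpu() moves the data off the GPU,
        # numpy() then yields an array that shares memory with the CPU tensor.
        return obs.detach().cpu().numpy()
    return obs
```

The idea would be to call this on every observation the environment returns from reset() and step(), so the collector never sees a tensor. The declared observation space would presumably also need to match the resulting array's shape and dtype.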