I’m trying out BC on some different gym envs and have encountered some issues I hoped someone could help me with. I’ve been using a custom RNN from the RLlib PPO example at ncps.readthedocs.io, and the trajectories are stored using SampleBatch as shown in the documentation. However, I had to add some keys because of errors: the callback now also updates “eps_id” and “agent_index”, and I had to set “state_in_0” to zeros the size of my RNN’s hidden state.
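Roughly, this is the shape of what ends up in the batch before it’s wrapped in a SampleBatch (the field names match RLlib’s SampleBatch keys; the sizes and values here are just illustrative placeholders, not my actual config):

```python
import numpy as np

# Hypothetical sizes -- the real values come from my env and RNN config.
T = 5                # timesteps collected in one episode
OBS_DIM = 4          # observation size
RNN_STATE_DIM = 32   # hidden-state size of the custom RNN

batch = {
    "obs": np.zeros((T, OBS_DIM), dtype=np.float32),
    "actions": np.zeros(T, dtype=np.int64),
    "rewards": np.zeros(T, dtype=np.float32),
    "dones": np.array([False] * (T - 1) + [True]),
    # Keys I had to fill in manually:
    "eps_id": np.full(T, 0, dtype=np.int64),      # one episode id per row
    "agent_index": np.zeros(T, dtype=np.int64),   # single agent -> all 0
    "state_in_0": np.zeros((T, RNN_STATE_DIM), dtype=np.float32),
}

# Every column must have one entry per timestep.
assert all(len(v) == T for v in batch.values())
```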
As I understood from sample_batch.py, “eps_id”, “agent_index” and “state_in_0” should be set by RLlib, and I had no problems when I wasn’t using a custom model. Is this something I have to set because of the custom model, or because of the RNN? I assume “state_in_0” comes from the RNN, but since the model is built on the RecurrentNetwork class I’m a bit surprised it came up.
Where everything stops is during training: in rllib/policy/rnn_sequencing.py I get a TypeError at line 289 saying unroll_ids is NoneType. Does anyone know how this can be fixed?