Errors during training using BC with a custom RNN

Hi all!
I’m trying out BC on some different gym envs and have encountered some issues I hoped someone could help me out with. I’ve been using a custom RNN from the RLlib PPO example at ncps.readthedocs.io, and the trajectories have been stored using SampleBatch as shown in the documentation. But I had to add some of the keys due to errors, so the callback is now also updating “eps_id” and “agent_index”, and I had to set “state_in_0” to zeros matching the size of my RNN’s hidden state.
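For reference, the extra columns described above could be filled in with something like the following sketch. This is an assumption about the setup, not the exact code: the key names mirror RLlib’s SampleBatch column names, while the observation width, episode length `T`, and `CELL_SIZE` are hypothetical placeholders (CELL_SIZE would have to match the custom RNN’s hidden-state size).

```python
import numpy as np

# Hypothetical sizes for illustration only.
CELL_SIZE = 32  # must match the custom RNN's hidden-state size
T = 5           # number of timesteps in the stored episode

# Columns keyed the way RLlib's SampleBatch expects; in a real script
# these arrays would be passed into a SampleBatch object.
batch = {
    "obs": np.random.rand(T, 4).astype(np.float32),
    "actions": np.zeros(T, dtype=np.int64),
    "rewards": np.ones(T, dtype=np.float32),
    "dones": np.array([False] * (T - 1) + [True]),
    # Keys I had to add manually: constant per episode / per agent.
    "eps_id": np.zeros(T, dtype=np.int64),
    "agent_index": np.zeros(T, dtype=np.int64),
    # One zeroed initial RNN state row per timestep.
    "state_in_0": np.zeros((T, CELL_SIZE), dtype=np.float32),
}
```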

As I understood from sample_batch.py, “eps_id”, “agent_index” and “state_in_0” should be set by RLlib, and I had no problems when I didn’t use a custom model. Is this something I have to set because of the custom model, or because of the RNN? I assume “state_in_0” comes from the RNN, but the model is built on the RecurrentNetwork class, so I’m a bit surprised it came up.

Where everything stops is during training. In /rllib/policy/rnn_sequencing.py I get a TypeError at line 289 saying unroll_ids is NoneType. Does anyone know how this can be fixed?

Thanks,
Robin

Hi @R-Liebert,

As far as I know, BC does not support RNNs. This is from a long time ago so it may have changed, but as far as I can tell it has not.

Thanks for the reply! Seems like it’s still unsupported, as I ran the same SampleBatch through another algorithm without issues. As I’m still new to RLlib, it was a good exercise in getting to know the library :slightly_smiling_face:

Edit: R2D2 has the same issues, so I’m going back to searching for errors, as the docs state that RNNs are supposed to work with BC as well.