Can anyone give me a hint on how to set up a custom network for the R2D2 algorithm that uses an LSTM? In principle I have something like observation -> linear1 -> linear2 -> LSTM -> Q values. The API is poorly documented and very hard to follow. In particular, it looks like overriding forward_rnn alone is not enough: I also have to override the get_q_values function, for reasons the documentation does not explain and that I could not have guessed. I'm trying to work it out from this page, but I can't make sense of it:
https://docs.ray.io/en/latest/rllib-models.html#custom-model-apis-on-top-of-default-or-custom-models
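For concreteness, here is roughly the architecture I mean, as a minimal sketch in plain PyTorch. This is not the RLlib model API, and it does not show how forward_rnn or get_q_values should be overridden; the class name, layer sizes, and ReLU activations are placeholders I made up for illustration.

```python
import torch
import torch.nn as nn


class RecurrentQNet(nn.Module):
    """observation -> linear1 -> linear2 -> LSTM -> Q values.

    Plain PyTorch sketch; names and sizes are hypothetical, not RLlib's.
    """

    def __init__(self, obs_dim: int, num_actions: int,
                 hidden: int = 64, lstm_size: int = 32):
        super().__init__()
        self.linear1 = nn.Linear(obs_dim, hidden)
        self.linear2 = nn.Linear(hidden, hidden)
        # batch_first=True means inputs are shaped [batch, time, features].
        self.lstm = nn.LSTM(hidden, lstm_size, batch_first=True)
        # One Q value per discrete action, computed at every time step.
        self.q_head = nn.Linear(lstm_size, num_actions)

    def forward(self, obs, state=None):
        # obs: [batch, time, obs_dim]
        # state: optional (h, c) tuple, each [1, batch, lstm_size];
        # None makes the LSTM start from zeros.
        x = torch.relu(self.linear1(obs))
        x = torch.relu(self.linear2(x))
        x, state = self.lstm(x, state)
        return self.q_head(x), state


# Dummy batch: 3 sequences, 5 time steps, 4 observation features.
net = RecurrentQNet(obs_dim=4, num_actions=2)
q_values, (h, c) = net(torch.zeros(3, 5, 4))
print(q_values.shape, h.shape)
```

What I can't figure out is how to map this onto the custom model API described on that page.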
Thanks in advance.