What exactly is the expected return value of the forward() function of a TorchModelV2 when it is used to implement an RNN? Specifically, what shape should it have, what do the dimensions correspond to, and what happens if the batch size is only 1? I have the same question regarding value_function(). First and foremost: do I have to keep the “sequence” dimension in the return values of forward() or value_function()?
According to the documentation, it is possible to implement an RNN by overriding forward() directly:
Note that the inputs arg entering forward_rnn is already a time-ranked single tensor (not an input_dict!) with shape (B x T x ...). If you further want to customize and need more direct access to the complete (non time-ranked) input_dict, you can also override your Model’s forward method directly (as you would do with a non-RNN ModelV2). In that case, though, you are responsible for changing your inputs and add the time rank to the incoming data (usually you just have to reshape).
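To make the "usually you just have to reshape" part concrete, here is a minimal sketch of how I understand it, written in plain PyTorch only. The helper names add_time_rank and drop_time_rank are hypothetical (they are not RLlib functions), and the sketch assumes that the flat batch is padded so that every sequence has the same length T, with one entry in seq_lens per sequence. Whether the caller expects the output flattened back to (B*T, ...) is exactly what I am unsure about, so treat the last step as an assumption.

```python
import torch


def add_time_rank(flat: torch.Tensor, seq_lens: torch.Tensor) -> torch.Tensor:
    """Reshape a padded (B*T, ...) tensor into (B, T, ...).

    Hypothetical helper: assumes every sequence is padded to the same
    length T, so that flat.shape[0] == B * T with B == len(seq_lens).
    """
    batch_size = seq_lens.shape[0]
    max_seq_len = flat.shape[0] // batch_size
    return flat.reshape(batch_size, max_seq_len, *flat.shape[1:])


def drop_time_rank(timed: torch.Tensor) -> torch.Tensor:
    """Fold (B, T, ...) back into (B*T, ...). Assumed, not confirmed."""
    return timed.reshape(-1, *timed.shape[2:])


# Example with a single sequence (B = 1) of length T = 4 and 3 features.
flat_obs = torch.arange(12, dtype=torch.float32).reshape(4, 3)
seq_lens = torch.tensor([4])

timed_obs = add_time_rank(flat_obs, seq_lens)  # shape (1, 4, 3)

# batch_first=True makes the LSTM consume (B, T, features).
lstm = torch.nn.LSTM(input_size=3, hidden_size=5, batch_first=True)
lstm_out, (h, c) = lstm(timed_obs)             # lstm_out: (1, 4, 5)

flat_out = drop_time_rank(lstm_out)            # shape (4, 5)
```

With B = 1 the reshape still works the same way; the batch dimension simply has size 1.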
I want to use the “normal” TorchModelV2 because I need more control over the inputs: my state is a Dict and I want to handle its individual parts differently, whereas the TorchRNN model just flattens and concatenates everything.
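As an illustration of what "handling individual parts differently" could look like, here is a rough sketch in plain PyTorch. The keys "position" and "sensors", the class name DictObsEncoder, and all sizes are made up for this example and are not taken from my actual environment or from RLlib.

```python
import torch
from torch import nn


class DictObsEncoder(nn.Module):
    """Encode each part of a Dict observation with its own sub-network.

    Hypothetical example: the keys and dimensions are placeholders.
    """

    def __init__(self, position_dim: int = 2, sensor_dim: int = 6,
                 embed_dim: int = 8):
        super().__init__()
        self.position_net = nn.Linear(position_dim, embed_dim)
        self.sensor_net = nn.Sequential(
            nn.Linear(sensor_dim, embed_dim),
            nn.ReLU(),
        )

    def forward(self, obs: dict) -> torch.Tensor:
        # Each part goes through its own network before being joined.
        pos_features = self.position_net(obs["position"])
        sensor_features = self.sensor_net(obs["sensors"])
        return torch.cat([pos_features, sensor_features], dim=-1)


encoder = DictObsEncoder()
obs = {
    "position": torch.zeros(4, 2),  # 4 flat rows (e.g. B*T = 4)
    "sensors": torch.zeros(4, 6),
}
features = encoder(obs)             # shape (4, 16)
```

The resulting features would then get the time rank added and be passed to the recurrent layer, which is the step the built-in flattening does not let me customize.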