How severely does this issue affect your experience of using Ray?
Low: It annoys or frustrates me for a moment.
Hello everyone,
I’m trying to create a custom actor-critic model with an LSTM, similar to this:
But with two separate LSTMs, one for the actor and one for the critic, as in
One thing that is not clear to me is how I should set the time length of the LSTM. It looks like RLlib sets it to 32 by default in my code, but I never set that value anywhere in my configuration:
The time length will vary depending on whether you are in the sampling phase or the training phase.
In the sampling phase the time dimension will be 1, because RLlib generates actions one step at a time.
During the training phase, with your configuration, the maximum size of the time dimension will be 20, based on your max_seq_len setting. That also serves as the truncation length for truncated backpropagation through time (TBPTT). You will likely have sequences shorter than 20, though: this happens if an episode is shorter than 20, or if you are using the truncate_episodes batch mode and sampling pauses in the middle of an episode because you have hit your rollout_fragment_length.
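For reference, here is roughly where those settings live in the config (a sketch using the classic config-dict API; the values are just illustrative, and "my_lstm_model" stands in for whatever name you registered your custom model under):

```python
config = {
    "model": {
        "custom_model": "my_lstm_model",  # hypothetical registered name
        "max_seq_len": 20,  # TBPTT truncation length at train time
    },
    # Sampling can pause (and cut a sequence short) after this many steps:
    "rollout_fragment_length": 200,
    # "truncate_episodes" lets sample fragments stop mid-episode:
    "batch_mode": "truncate_episodes",
    "train_batch_size": 4000,
}
```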
Hi @mannyv ,
thanks for the info. I inserted a breakpoint inside the forward call of my custom LSTM model and saw that seq_lens is a torch tensor of size 32 filled with ones. From what you are telling me, those 1s are probably the time dimension, since I’m in the sampling phase.
The strange thing, then, is that the batch size is 32, even though I don’t think I ever set it to 32 in my config.
I continued debugging, and at some point seq_lens becomes tensor([8, 8, 8, 8], dtype=torch.int32).
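If I understand correctly, seq_lens = [8, 8, 8, 8] should mean the flat batch of 32 transitions gets folded into 4 sequences of length 8 before the LSTM, roughly like this (plain torch, using my feature size of 18; I believe RLlib does this via add_time_dimension in rllib/policy/rnn_sequencing.py):

```python
import torch

flat = torch.randn(32, 18)  # 32 transitions, 18 features each
seq_lens = torch.tensor([8, 8, 8, 8], dtype=torch.int32)

# Fold into [num_sequences, max_seq_len, features] before the LSTM;
# with equal lengths no padding is needed.
max_len = int(seq_lens.max())
folded = flat.reshape(-1, max_len, flat.shape[-1])
print(folded.shape)  # torch.Size([4, 8, 18])
```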
P.S. Just to give a bit of context: in my environment I have three agents, each observing a state space of dimension 18, where 9 entries are the real observations and 9 are the action_mask observations.
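Roughly, the per-agent observation space looks like this (the key names here are approximate, not my exact code):

```python
import numpy as np
from gym.spaces import Box, Dict

# Per-agent observation: 9 real features plus a 9-dim action mask.
obs_space = Dict({
    "obs": Box(low=-np.inf, high=np.inf, shape=(9,), dtype=np.float32),
    "action_mask": Box(low=0.0, high=1.0, shape=(9,), dtype=np.float32),
})
```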
So in the end I’d like to understand a little better how batch sizes and time dimensions are handled in general, not necessarily in the context of my environment.
EDIT:
By following the function calls, I’ve ended up here
where the batch size is defined as 32 and, consequently, seq_lens is defined as
Follow my posts in this thread. They might have some of the info you are interested in.
That code is initialization code. There are about three calls to forward at the very beginning, before training starts, that are used to set up the ViewRequirements.
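If you want to see the pattern for yourself, a minimal probe model will print it: a few dummy calls with batch size 32 and all-ones seq_lens at startup, then ones during sampling, then real sequence lengths during training. This is just a sketch against the TorchModelV2 API, not your actual model:

```python
import numpy as np
import torch.nn as nn
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class ShapeProbeModel(TorchModelV2, nn.Module):
    """Logs batch size and seq_lens on every forward() call."""

    def __init__(self, obs_space, action_space, num_outputs,
                 model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)
        self.fc = nn.Linear(int(np.prod(obs_space.shape)), num_outputs)
        self._features = None

    def forward(self, input_dict, state, seq_lens):
        obs = input_dict["obs_flat"].float()
        # Init calls: batch of 32, seq_lens all ones. Sampling: seq_lens of
        # ones (one step per call). Training: real lengths, e.g. [8, 8, 8, 8].
        print("B =", obs.shape[0], "seq_lens =", seq_lens)
        self._features = self.fc(obs)
        return self._features, state

    def value_function(self):
        # Dummy value head, just enough to satisfy the API for probing.
        return self._features.mean(-1)
```

Register it with ModelCatalog.register_custom_model("shape_probe", ShapeProbeModel), point "custom_model" at it, and the printed shapes should match what you are seeing.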
Hi @ColdFrenzy, I am experiencing the same problem and it has completely blocked my work; it is driving me crazy. Could you explain how you solved the issue? Thanks!
Me too: I’m running into the exact same thing, with the batch being 32 by default when arriving in the forward method and the seq_lens tensor full of 1s.
And I suspect this is causing the error:
File "C:\Users\marko\anaconda3\envs\rllib-torch\lib\site-packages\ray\rllib\evaluation\worker_set.py", line 181, in __init__
raise e.args[0].args[2]
RuntimeError: shape '[5504, 1]' is invalid for input of size 12288
I don’t know where this shape and input size are coming from. My environment outputs states of shape (96, 36) representing one sequence per step.
My entire dataset is batched with a size of 128.
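For what it’s worth, the error itself is a plain tensor-view mismatch and is trivial to reproduce. I also notice that 12288 = 128 × 96 (my batch size times the sequence length), though I can’t tell whether that is actually where the number comes from:

```python
import torch

# Reproducing the failure mode with the numbers from the traceback:
# view() fails when the target shape doesn't match the element count.
x = torch.zeros(12288)  # 12288 = 128 * 96 -- batch size * sequence length?
try:
    x.view(5504, 1)
except RuntimeError as e:
    print(e)  # shape '[5504, 1]' is invalid for input of size 12288
```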