Yet another question on RNN sequencing

mannyv · January 2, 2022, 4:49pm

There are 3 (sometimes 4) distinct phases in which the model is called.

Your debugging has revealed 2 of them.

These 3 are from the initialization phase. They are taken from a dummy batch of all zeros. I ignore these unless they are generating an error.

torch.Size([32, 1, 2])
torch.Size([1, 1, 2])
torch.Size([4, 8, 2])

These are from the rollout phase, when compute_actions is called to sample new trajectories from the environment. Your config has 20 envs per worker, each of which is taking 1 step.

torch.Size([20, 1, 2])
torch.Size([20, 1, 2])
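
To make the rollout shape concrete, here is a rough sketch in plain Python (no torch) of how a batch of 20 per-env observations ends up as [20, 1, 2] once a time axis of length 1 is added. The helper names add_time_axis and shape3 are made up for illustration and are not RLlib functions. The observation size of 2 is read off the shapes above.

```python
from typing import List, Tuple

Obs = List[float]


def add_time_axis(obs_batch: List[Obs]) -> List[List[Obs]]:
    # Hypothetical helper: wrap each observation in a length-1 sequence,
    # turning [batch, obs_dim] into [batch, 1, obs_dim].
    return [[obs] for obs in obs_batch]


def shape3(batch: List[List[Obs]]) -> Tuple[int, int, int]:
    # Report (batch, time, features) for a rectangular nested list.
    return (len(batch), len(batch[0]), len(batch[0][0]))


num_envs = 20  # envs per worker, from the config discussed above
obs_dim = 2    # last dimension seen in the printed shapes

# One observation per env, since each env takes exactly 1 step per call.
obs_batch = [[0.0] * obs_dim for _ in range(num_envs)]
rollout_input = add_time_axis(obs_batch)

print(shape3(rollout_input))  # (20, 1, 2)
```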

After you collect 4000 steps, the training phase will run. You have not reported that phase, but when you hit it, the input will have shape [num_episodes, max_seq, 2]. The max_seq is dynamic by default, so if you did not have an episode that lasted 20 steps, it will be shorter than that.
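
Here is a minimal sketch of what "dynamic max_seq" means for the training shape, again in plain Python. It pads a few episodes to the length of the longest one in the batch rather than to the cap of 20. The function pad_episodes, the zero padding value, and the episode lengths are assumptions for illustration, not RLlib's actual implementation. The sketch also assumes no episode is longer than the cap and simply raises if one is.

```python
from typing import List, Tuple

Obs = List[float]


def pad_episodes(
    episodes: List[List[Obs]],
    max_seq_cap: int = 20,
    pad_value: float = 0.0,
) -> List[List[Obs]]:
    # Pad to the longest episode actually present, not to the cap.
    longest = max(len(ep) for ep in episodes)
    if longest > max_seq_cap:
        # Handling of over-long episodes is out of scope for this sketch.
        raise ValueError("sketch assumes no episode exceeds max_seq_cap")
    obs_dim = len(episodes[0][0])
    padded = []
    for ep in episodes:
        pad_rows = [[pad_value] * obs_dim for _ in range(longest - len(ep))]
        padded.append([list(obs) for obs in ep] + pad_rows)
    return padded


def shape(batch: List[List[Obs]]) -> Tuple[int, int, int]:
    # Report (num_episodes, max_seq, features).
    return (len(batch), len(batch[0]), len(batch[0][0]))


# Three made-up episodes of 5, 12 and 8 steps, each step with 2 features.
episodes = [
    [[0.1, 0.2]] * 5,
    [[0.3, 0.4]] * 12,
    [[0.5, 0.6]] * 8,
]
batch = pad_episodes(episodes)

print(shape(batch))  # (3, 12, 2): max_seq is 12 here, not 20
```

If one of the episodes had lasted the full 20 steps, the middle dimension would be 20 instead.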

Happy New Year
