Num_env & agent_steps_trained 0 even though steps sampled?

I have been experiencing a similar issue with off-policy algorithms like DDPG and SAC when using replay buffers with the storage unit set to episodes. I made a post about it here: Replay buffer with episodes as storage unit not training
