@sven1977 Similar to rfali, I also did what is described in "Reproducibility issues using OpenAI Gym" (Harald's blog) to get the same action sequence for a specific seed in my custom env during training:
env.action_space.seed(RANDOM_SEED)
(I was using TD3.)
If I don't do that, the actions taken by the agent are different on a second run with the same seed when training via RLlib. (If I remember correctly, seeding the action space is not needed when using Stable Baselines 2 or 3, by the way. Maybe that is somehow helpful.)