Reproducibility Concerns with GPU

Hey guys,

So my team is concerned about our agent's lack of reproducibility when training it on a GPU.

Some context: we are using SAC (Ray 2.0) with the following config:

[screenshot of SAC config]

We implemented the suggestions from the following example (action space seeding):
https://github.com/ray-project/ray/blob/master/rllib/examples/deterministic_training.py

Sanity check: we ran 2 training sessions (10K steps each). Both SACs used the exact same config and seed, and here's the resulting reward curve (total reward vs. episode number):

[reward curve plot]

This is the seeding method I'm using, called in both the body and the main of my script:
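(The original snippet was posted as an image; here is a sketch of a typical seeding helper for this kind of setup. The function name `seed_everything` is my own, and this only covers the Python/NumPy/PyTorch RNGs, not RLlib's own `seed` config entry.)

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed all the RNGs a single training process typically touches."""
    random.seed(seed)                         # Python's built-in RNG
    np.random.seed(seed)                      # NumPy RNG
    torch.manual_seed(seed)                   # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)          # PyTorch RNG on every GPU (no-op without CUDA)
    os.environ["PYTHONHASHSEED"] = str(seed)  # hash randomization for new subprocesses
```

Note that this makes the RNG streams reproducible, but by itself it does not make GPU kernels deterministic.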

Is there a setting I’m forgetting to use?

@kourosh one of your favourite topics? 🙂


This is what we have observed as well: the same workload that is perfectly deterministic on CPU is not deterministic on GPU.
I did some Google searches back then and concluded that, because of the parallel and asynchronous nature of GPU execution (e.g. floating-point reductions whose operand order can differ between runs), it's hard to make GPU training completely deterministic.
Long story short, I don't think you forgot any configuration.
And please share if you discover more about this topic.
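For what it's worth, PyTorch does expose a few knobs that push GPU execution toward determinism, at the cost of speed. A sketch (these are standard PyTorch settings, but whether your RLlib version lets you apply them inside workers is an assumption; you may need to set them in your own training code):

```python
import os

import torch

# Must be set before the first cuBLAS call to get deterministic GEMMs
# (required by use_deterministic_algorithms on CUDA >= 10.2).
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

# Force cuDNN to use deterministic kernels and disable the autotuner,
# which can otherwise pick different algorithms from run to run.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Raise an error whenever an op has no deterministic implementation,
# instead of silently falling back to a nondeterministic one.
torch.use_deterministic_algorithms(True)
```

Even with all of this, some ops simply have no deterministic CUDA implementation and will raise a RuntimeError, which is itself useful for finding the nondeterministic parts of a model.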
