Different results on GPU and CPU

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty in completing my task, but I can work around it.

I am training a multi-agent RL algorithm, very similar to this example but with my own custom environment. I noticed that changing only ‘num_gpus’ from 0 to 1 completely changes my results: training on the GPU gives worse results than training on the CPU. I know that results can differ somewhat between hardware, but I would not expect the differences to be this significant. See the attached image of mean rewards (CPU left, GPU right).
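
To rule out an accidental second difference between the two runs, one option is to diff the two configuration dicts before training. The following is only a minimal sketch: `diff_configs` is a hypothetical helper (not part of Ray), and the keys and values shown are illustrative placeholders rather than the actual configuration used here, including the seed value, which is an assumption.

```python
def diff_configs(a, b, prefix=""):
    """Return {dotted_key: (value_in_a, value_in_b)} for every entry that differs.

    Hypothetical helper for illustration; it is not part of Ray.
    Nested dicts are compared recursively.
    """
    diffs = {}
    for key in sorted(set(a) | set(b), key=str):
        path = f"{prefix}{key}"
        va = a.get(key, "<missing>")
        vb = b.get(key, "<missing>")
        if isinstance(va, dict) and isinstance(vb, dict):
            # Descend into nested sections such as the multi-agent settings.
            diffs.update(diff_configs(va, vb, prefix=path + "."))
        elif va != vb:
            diffs[path] = (va, vb)
    return diffs


# Placeholder configs (assumed values, not the real ones from this post).
cpu_config = {
    "framework": "torch",
    "seed": 42,  # assumed: a fixed seed so both runs start identically
    "num_gpus": 0,
    "multiagent": {"policies": ["policy_0", "policy_1"]},
}
gpu_config = dict(cpu_config, num_gpus=1)

differences = diff_configs(cpu_config, gpu_config)
print(differences)
```

If the printed dict contains anything other than the ‘num_gpus’ entry, the two runs differ in more than the device setting.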

I have a notebook with an i9-13900H and an RTX 3080 Ti. I am using Ray 2.4.0, torch 2.0 and CUDA 12.2.
I know I should update to Ray 2.7, but I haven’t had the time to adapt my code to the new API.

What could be the reasons for this?