Reproduce DDPPO algorithm's result

seungju-mmc · October 5, 2022, 5:30am

Hello, I am seungju.
I have a problem reproducing the results of distributed reinforcement learning algorithms such as DDPPO.
Here is my DDPPO configuration that triggers the reproducibility problem:

I found that the results cannot be reproduced if remote_worker_envs is true.

I guess it is not reproducible because it depends on parallel sampling, which can easily be affected by hardware factors.

Is it possible to make it reproducible? If it is not, I would appreciate it if you could let me know.

Thanks.

kourosh · January 5, 2023, 7:00pm

Hi @seungju-mmc, unfortunately DDPPO (much like APPO) is an async algorithm, and reproducibility is not guaranteed for these algorithms. As you said, it depends on many factors, including hardware and OS state differences.
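
To illustrate why arrival order alone can change results, here is a minimal sketch in plain Python. It is not RLlib code: the worker names, the values, and the `aggregate` helper are hypothetical and chosen only to make the effect visible. It relies on one well-known fact, that floating-point addition is not associative, so combining the same per-worker contributions in a different order can give a different total even when every worker is seeded identically.

```python
from itertools import permutations

# Hypothetical per-worker contributions to one shared update.
# The values are exaggerated on purpose so the rounding effect is obvious.
WORKER_UPDATES = {
    "worker_0": 1e16,
    "worker_1": 1.0,
    "worker_2": -1e16,
}


def aggregate(arrival_order):
    """Sum the worker contributions in the order in which they arrive."""
    total = 0.0
    for name in arrival_order:
        total += WORKER_UPDATES[name]
    return total


if __name__ == "__main__":
    # Each permutation stands for one possible completion order of the workers.
    for order in permutations(WORKER_UPDATES):
        print(order, aggregate(order))
```

In a real run the differences per step are far smaller, but they feed back into the policy, which changes the next batch of samples, so two runs can drift apart over time. When workers finish in an order decided by the scheduler, the hardware, or the OS, a fixed seed does not pin down that order.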
