Exploration in PPO and policy gradient algorithms

Hi @saeid93 ,

Give this a read!
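The parameters you are looking for are entropy_coeff and entropy_coeff_schedule. entropy_coeff sets the weight of the entropy bonus in the loss (a higher value means more exploration), and entropy_coeff_schedule lets you change that weight over the course of training.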
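In case it helps, here is a rough sketch of what these two settings control. It is not RLlib's actual code: the helper names (entropy, scheduled_coeff, loss_with_entropy_bonus) are made up for illustration, and I'm assuming the schedule is a list of [timestep, value] pairs that gets linearly interpolated.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def scheduled_coeff(schedule, timestep):
    """Linearly interpolate between [timestep, value] points (assumed format).

    Before the first point the first value is used; after the last point
    the last value is used.
    """
    if timestep <= schedule[0][0]:
        return schedule[0][1]
    for (t0, v0), (t1, v1) in zip(schedule, schedule[1:]):
        if t0 <= timestep <= t1:
            frac = (timestep - t0) / (t1 - t0)
            return v0 + frac * (v1 - v0)
    return schedule[-1][1]


def loss_with_entropy_bonus(policy_loss, probs, coeff):
    """Subtract the weighted entropy so that minimizing the loss
    rewards policies that keep their action distribution spread out."""
    return policy_loss - coeff * entropy(probs)


# Hypothetical schedule: start at 0.01 and anneal to 0.0 by step 100k.
schedule = [[0, 0.01], [100_000, 0.0]]

for step in (0, 50_000, 100_000, 200_000):
    coeff = scheduled_coeff(schedule, step)
    loss = loss_with_entropy_bonus(1.0, [0.5, 0.5], coeff)
    print(step, coeff, loss)
```

With a schedule like this the agent is pushed to explore early on, and the push fades out as the coefficient goes to zero.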

Cheers
