PPO with beta distribution

Hi! I’m interested in experimenting with a beta distribution instead of a Gaussian distribution during PPO training. However, looking at the source code in `tf_action_dist.py`, it doesn’t look like the beta distribution implementation has `kl` or `entropy` methods. Am I looking in the wrong place, or would I have to implement these myself to make the beta distribution compatible with PPO? Thanks!
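For reference, if these do have to be implemented by hand, both quantities have closed forms for the Beta distribution, so the missing methods should be straightforward to add. Here is a sketch (not RLlib code; `beta_entropy` and `beta_kl` are my own names) using SciPy just to check the formulas:

```python
from scipy.special import gammaln, digamma

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return gammaln(a) + gammaln(b) - gammaln(a + b)

def beta_entropy(a, b):
    """Differential entropy of Beta(a, b):
    ln B(a,b) - (a-1)ψ(a) - (b-1)ψ(b) + (a+b-2)ψ(a+b)."""
    return (log_beta(a, b)
            - (a - 1.0) * digamma(a)
            - (b - 1.0) * digamma(b)
            + (a + b - 2.0) * digamma(a + b))

def beta_kl(a1, b1, a2, b2):
    """Closed-form KL( Beta(a1, b1) || Beta(a2, b2) )."""
    return (log_beta(a2, b2) - log_beta(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))

# Sanity checks: Beta(1,1) is uniform on [0,1], so its entropy is 0,
# and the KL of any distribution to itself is 0.
print(beta_entropy(1.0, 1.0))
print(beta_kl(2.0, 3.0, 2.0, 3.0))
```

The same math would translate directly to TensorFlow ops inside the action-distribution class (digamma and lgamma are available as `tf.math.digamma` and `tf.math.lgamma`).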
