Hi! I’m interested in experimenting with a beta distribution instead of a Gaussian distribution during PPO training. However, looking at the source code in `tf_action_dist.py`, it doesn’t look like the beta distribution implementation has `kl` or `entropy` functions. Am I looking in the wrong place, or would I have to implement these functions myself to make the beta distribution code compatible with PPO? Thanks!
@jgonik, did you find a way around this? I have the same situation. Thanks.
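In case it helps anyone landing here: both quantities have closed forms for the beta distribution, so writing them by hand should be feasible. Below is a minimal plain-Python sketch of the math only. The function names (`digamma`, `log_beta`, `beta_entropy`, `beta_kl`) are my own and are not part of RLlib; the digamma is a simple series approximation, and the code assumes scalar concentration parameters greater than zero. To use it in `tf_action_dist.py` you would still need to port it to tensor ops and to whatever method signatures the action distribution class expects, which I have not checked.

```python
import math


def digamma(x):
    """Approximate digamma (psi) for x > 0.

    Shifts x upward with psi(x) = psi(x + 1) - 1/x, then applies the
    standard asymptotic series. Accuracy is roughly 1e-8, not exact.
    """
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_entropy(a, b):
    """Differential entropy of Beta(a, b)."""
    return (
        log_beta(a, b)
        - (a - 1.0) * digamma(a)
        - (b - 1.0) * digamma(b)
        + (a + b - 2.0) * digamma(a + b)
    )


def beta_kl(a1, b1, a2, b2):
    """KL(Beta(a1, b1) || Beta(a2, b2))."""
    return (
        log_beta(a2, b2)
        - log_beta(a1, b1)
        + (a1 - a2) * digamma(a1)
        + (b1 - b2) * digamma(b1)
        + (a2 - a1 + b2 - b1) * digamma(a1 + b1)
    )


if __name__ == "__main__":
    # Beta(1, 1) is the uniform distribution on [0, 1], so its entropy is 0.
    print(beta_entropy(1.0, 1.0))
    # KL between a distribution and itself is 0.
    print(beta_kl(2.0, 3.0, 2.0, 3.0))
    # Example values for an "old" and "new" policy (made-up parameters).
    print(beta_kl(2.0, 3.0, 2.5, 2.5))
```

One thing to watch for if you go this route: the beta distribution lives on [0, 1], so actions need to be rescaled to the environment's bounds. The KL is unaffected by that rescaling as long as both distributions use the same one, but the entropy picks up a constant offset of the log of the interval width per action dimension.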