PPO with beta distribution

Hi! I’m interested in experimenting with a beta distribution instead of a Gaussian distribution during PPO training. However, looking at the source code in `tf_action_dist.py`, it doesn’t look like the beta distribution implementation has `kl` or `entropy` methods. Am I looking in the wrong place, or would I have to implement these myself to make the beta distribution compatible with PPO? Thanks!
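For reference, if these do have to be implemented by hand, both quantities have closed forms for the Beta distribution, so the missing methods should be straightforward to add. Here is a sketch (not RLlib code; `beta_entropy` and `beta_kl` are my own names) using SciPy just to check the formulas:

```python
from scipy.special import gammaln, digamma

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return gammaln(a) + gammaln(b) - gammaln(a + b)

def beta_entropy(a, b):
    """Differential entropy of Beta(a, b):
    ln B(a,b) - (a-1)ψ(a) - (b-1)ψ(b) + (a+b-2)ψ(a+b)."""
    return (log_beta(a, b)
            - (a - 1.0) * digamma(a)
            - (b - 1.0) * digamma(b)
            + (a + b - 2.0) * digamma(a + b))

def beta_kl(a1, b1, a2, b2):
    """Closed-form KL( Beta(a1, b1) || Beta(a2, b2) )."""
    return (log_beta(a2, b2) - log_beta(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))

# Sanity checks: Beta(1,1) is uniform on [0,1], so its entropy is 0,
# and the KL of any distribution to itself is 0.
print(beta_entropy(1.0, 1.0))
print(beta_kl(2.0, 3.0, 2.0, 3.0))
```

The same math would translate directly to TensorFlow ops inside the action-distribution class (digamma and lgamma are available as `tf.math.digamma` and `tf.math.lgamma`).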
