Best way to handle policy change

I tested my environment and didn't get the results I expected. I repeated my RL training with slight modifications to the model, etc., but got the same results. The problem is that the results keep getting better until some big updates happen. I thought PPO's clip parameter should address this problem, but apparently it doesn't (from what I can tell). I remember seeing an article (TL;DR) arguing that PPO doesn't always solve this problem and that TRPO is more general, but I don't know what the reason was. Any guidance on how I can get past this?
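For reference, here is a minimal sketch of the clipped surrogate objective that the clip parameter controls, written with NumPy. The function names (`ppo_clip_objective`, `approx_kl`) and the value `clip_eps=0.2` are assumptions for illustration, not taken from any particular library or from my actual settings. It shows that once the probability ratio leaves the clip range in the direction the advantage favors, the objective goes flat: the gradient for that sample vanishes, but nothing in the objective pulls the ratio back, so the clip by itself is not a hard limit on how far the policy moves.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A).

    ratio     : pi_new(a|s) / pi_old(a|s) for each sample
    advantage : advantage estimate for each sample
    clip_eps  : clip range (hypothetical default, not from the post)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return np.minimum(unclipped, clipped)

def approx_kl(logp_old, logp_new):
    """Simple sample estimate of KL(pi_old || pi_new) from log-probabilities
    of the actions that were sampled under the old policy."""
    return float(np.mean(logp_old - logp_new))

# With a positive advantage, a ratio of 1.5 and a ratio of 3.0 score the same:
# the objective is flat past 1 + clip_eps, so it does not penalize drifting further.
ratios = np.array([1.5, 3.0])
advantages = np.array([1.0, 1.0])
print(ppo_clip_objective(ratios, advantages))
```

Logging something like `approx_kl` after each update is one possible way to check whether the drops coincide with unusually large policy changes; whether that is the cause here is only a guess.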
[Screenshot: Bokeh plot]
[Screenshot: settings]