How severely does this issue affect your experience of using Ray?: High
I am running into an issue where my training stops making progress entirely: the reward drops and shows no sign of recovering. I am using the PPO algorithm on a custom environment with Ray 2.7.1. This is what the episode reward mean looks like for all my trials:
Has anyone had this issue, and if so, how did you configure your run to make it work?
Thanks so much!