Maximum recommended reward

evo11x · July 7, 2022, 10:25am

I use APPO, and I only have the reward plots from Ray. What kind of plots are you talking about?
No, I don’t use ray.tune yet, because I am still struggling to make it learn at all.
The min and mean reward are going down instead of up, while the max reward is going up. The episode length never reaches the maximum of 100, because the actor chooses to die instead of learning. If I increase the “game over” penalty, it chooses to do almost nothing in order to maintain a fixed reward.
I need the reward mean to be above 0 in order to have a good result.
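
To make the trade-off above concrete, here is a minimal sketch of how the discounted return can favor dying early or idling, depending on the penalty. All numbers are assumptions for illustration only (a constant per-step reward of -0.1, a discount of 0.99, and penalties of -1 and -10); they are not taken from my environment, and `discounted_return` is a hypothetical helper, not an RLlib function.

```python
def discounted_return(step_reward, n_steps, terminal_penalty=0.0, gamma=0.99):
    """Discounted return of an episode with a constant per-step reward.

    The optional terminal penalty is added on the last step (index n_steps - 1).
    All values are hypothetical and only illustrate the trade-off.
    """
    total = sum(step_reward * gamma**t for t in range(n_steps))
    return total + terminal_penalty * gamma ** (n_steps - 1)


# Assumed numbers: the agent idles and collects -0.1 per step.
survive = discounted_return(-0.1, n_steps=100)

# Dying after 5 steps with a mild "game over" penalty of -1.
die_early = discounted_return(-0.1, n_steps=5, terminal_penalty=-1.0)

# Dying after 5 steps with a much harsher penalty of -10.
die_early_harsh = discounted_return(-0.1, n_steps=5, terminal_penalty=-10.0)

print(f"survive 100 steps:         {survive:.2f}")
print(f"die early (penalty -1):    {die_early:.2f}")
print(f"die early (penalty -10):   {die_early_harsh:.2f}")
```

Under these assumed numbers, dying early beats surviving when the penalty is mild, and idling until the end beats dying when the penalty is harsh. In both cases the return stays negative, which would match a mean reward that never gets above 0.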

I am thinking of using offline data to get it moving in the right direction, which is why I asked this question here.
