Hi All,
I have a question about how big rewards should be. I currently have a first-place reward of 1000 (firstPlace in the code below), and every punishment or reward, both per step and at the very end, is calculated as a fraction of that amount.
For example:
reward = 0
if self.tinyPunish:
    self.tinyPunish = False
    reward -= firstPlace * 0.0001
if self.smallPunish:
    self.smallPunish = False
    reward -= firstPlace * 0.001
if self.mediumPunish:
    self.mediumPunish = False
    reward -= firstPlace * 0.01
if self.strongPunish:
    self.strongPunish = False
    reward -= firstPlace * 0.1
…a bit further down…
if tieredUp == 10:
    reward += firstPlace * 0.02
elif tieredUp == 11:
    reward += firstPlace * 0.08
if self.leveledUp:
    if (self.level > 4) and ((self.boardUnitCount() + 1) >= self.level):  # don't want to reward for rushing early levels as I think that's just dumb
        """
        Reward for getting to level: 5: 12.5
        Reward for getting to level: 6: 21.6
        Reward for getting to level: 7: 34.3
        Reward for getting to level: 8: 51.2
        Reward for getting to level: 9: 72.9
        Reward for getting to level: 10: 100.0
        """
        award = firstPlace * 0.0001 * (self.level ** 3)
        print(f"Awarded: {award} for leveling up with: {self.boardUnitCount()} heroes!")
        reward += award
    self.leveledUp = False
Is that okay, or do I need to scale everything to between 0 and 1 per step, or make sure that the total reward doesn’t exceed 1 per episode?
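To make the scaling option concrete, here is a minimal sketch of what dividing every reward by the first-place amount would look like. It assumes firstPlace is 1000, as in the numbers above; FIRST_PLACE and scale_reward are hypothetical names used only for illustration and are not part of the environment code. Since every reward in the snippet is already a multiple of firstPlace, this only changes the units: the relative sizes of the punishments and awards stay the same, and the first-place reward becomes 1.0.

```python
FIRST_PLACE = 1000.0  # assumed value of firstPlace, taken from the post


def scale_reward(raw_reward: float, first_place: float = FIRST_PLACE) -> float:
    """Express a raw reward as a fraction of the first-place reward."""
    return raw_reward / first_place


# Raw values from the snippet, re-expressed on the scaled range.
strong_punish = scale_reward(-FIRST_PLACE * 0.1)               # about -0.1
level_10_award = scale_reward(FIRST_PLACE * 0.0001 * 10 ** 3)  # about 0.1
first_place = scale_reward(FIRST_PLACE)                        # 1.0

print(strong_punish, level_10_award, first_place)
```

Note that this only bounds the single first-place reward at 1.0. The per-step terms still accumulate, so the total over an episode can end up above or below that.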