So I use custom_metrics to log some environment information each step, but I have noticed it does not show the values I expect.
For example, one value I log only ever takes the value 0 or 1, but the logging shows a mean between 2 and 10, which should not be possible. I currently log each step with the callback on_episode_step and run in a continuing environment with soft_horizon=True and horizon=1000.
I assumed it might have something to do with the horizon being 1000 steps long (though I really thought that was the window the min/max/mean were computed over), so I also tried horizon=1, with similar results.
How does the metric logging work and what is it that I’m misunderstanding here?
Hey @albheim, thanks for raising this issue.
That’s indeed strange. If you are certain that your env does not produce values outside 0 and 1, could you maybe try to reproduce this issue in a small, self-contained script so I can debug this?
Yeah, as everyone probably expected, this was an error on my side.
I logged both the variable (whether a certain job was dropped or not) and the cost of dropping that job (for later use in the reward function). The cost was set to 10, so it rescaled the interval 0-1 to 0-10, and I got confused because I was looking at the logged dropped cost instead of the dropped state. I thought I must have misunderstood something about how Ray aggregates metrics, but it all worked as I initially assumed.
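For anyone who runs into the same confusion, here is a minimal sketch of the mix-up in plain Python. It does not use RLlib at all: `summarize` is a hypothetical stand-in for a min/max/mean summary over per-step values, and the names `dropped`, `dropped_cost`, and `DROP_COST` are made up for illustration. Only the cost of 10 and the 0/1 flag come from my setup.

```python
import random
from statistics import mean

DROP_COST = 10  # cost of dropping a job (the value used in this thread)


def summarize(values):
    """Hypothetical stand-in for a min/max/mean summary of logged values."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


random.seed(0)

# Per-step flag: 1 if a job was dropped on that step, otherwise 0.
dropped = [random.randint(0, 1) for _ in range(1000)]

# Per-step cost: the same flag scaled by the drop cost.
dropped_cost = [DROP_COST * d for d in dropped]

dropped_stats = summarize(dropped)
cost_stats = summarize(dropped_cost)

print("dropped:     ", dropped_stats)  # mean stays within [0, 1]
print("dropped_cost:", cost_stats)     # mean lands within [0, 10]
```

The second summary is just the first one scaled by 10, so a mean between 2 and 10 is exactly what you would see if you read the cost metric while expecting the flag.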
Thanks for the help, and sorry for taking up your time.