How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
When I set config["evaluation_duration"] = 5 and config["evaluation_interval"] = num_iterations, call tune.run(...) with stop={"training_iteration": num_iterations}, and use the following callback:
class SomeLogger(DefaultCallbacks):
    def on_episode_start(...):
        if worker.policy_config["in_evaluation"]:
            print('test')
Then 'test' is printed 6 times, although I expect 5, since evaluation runs 5 episodes. I also have a function def on_episode_step(...), which is called n times for each of the first 5 episodes, where n is the episode length, and 0 times for the 6th episode, which makes sense. Ideally, though, on_episode_start would also be called just 5 times instead of 6.
Is there a specific reason that it gets called one extra time?
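To make the pattern concrete, here is a minimal sketch with Ray stubbed out. The hook names follow my callback, but SteppedEpisodeCounter, simulate, the episode_id argument, and the episode length of 3 are all made up for illustration. It only replays the call counts I observe and does not claim to reflect how RLlib schedules episodes internally. It also shows a workaround I am considering: treating an episode as started only once its first step arrives.

```python
# Stand-in sketch: nothing here imports Ray. The method names mirror the
# hooks in my callback; the class and the driver loop are hypothetical.


class SteppedEpisodeCounter:
    """Counts episode starts separately from episodes that actually stepped."""

    def __init__(self):
        self.starts = 0        # every on_episode_start call
        self.stepped = set()   # ids of episodes that took at least one step

    def on_episode_start(self, episode_id):
        self.starts += 1

    def on_episode_step(self, episode_id):
        # The first step of an episode is where "real start" logic could go.
        self.stepped.add(episode_id)


def simulate(counter, episode_lengths):
    """Replay the observed pattern: one start per episode, then n steps."""
    for episode_id, n_steps in enumerate(episode_lengths):
        counter.on_episode_start(episode_id)
        for _ in range(n_steps):
            counter.on_episode_step(episode_id)


counter = SteppedEpisodeCounter()
# Five evaluation episodes of assumed length 3, plus a sixth episode that
# is started but never stepped.
simulate(counter, [3, 3, 3, 3, 3, 0])
print(counter.starts, len(counter.stepped))
```

Counting only episodes that reach on_episode_step would give me the 5 I expect, but I would still like to understand where the extra on_episode_start call comes from.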