I’d like to know if I can add an attribute to my
DefaultCallbacks and have it persist across multiple on_episode_end callbacks.
What i want is a metric that will average a value over the last 100 episodes. However, there are two things I need in this metric:
- if the model completed less than 100 episodes, I still want to sum them up and divide by 100.
- At some points in training, I need to “reset” this list of 100 episodes as if the model were starting again.
So, my thoughts were “let’s make a custom_metric.” If I add a
collections.deque to my
maxlen=100 and append to it at every
on_episode_end, I can check it at the
on_train_result callback (where I need it) and add it to the results dictionary. However, I do not know exactly if, when RAY parallelizes into multiple workers, there will be multiple instances of this callback object. Will I be able to trust this value? the
on_episode_end callback is called in a “centralized” way?