How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I am running multi-agent RL experiments with RLlib and would like to extend the checkpoint logic. In particular, I am looking for two extra features:
- Saving checkpoints based on multiple criteria. I would like to keep one checkpoint for the best episode_reward_mean and another for the best episode_reward_max.
- Saving a checkpoint (potentially a third one besides the two above) based on each individual agent's metric. For example, I would like to save, for each agent separately, the policy that achieved the best individual episode reward.
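To make the request concrete, here is a rough sketch of the behavior I have in mind. It is plain Python and does not use any RLlib API: `BestCheckpointTracker`, `lookup`, and `save_fn` are hypothetical names of my own, and the `policy_reward_mean` key and the `agent_0`/`agent_1` IDs are assumptions about what the per-agent entries in the training result dict look like.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

MetricPath = Tuple[str, ...]


def lookup(result: Dict[str, Any], path: MetricPath) -> Any:
    """Follow a tuple of keys into a nested result dict; return None if missing."""
    node: Any = result
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


class BestCheckpointTracker:
    """Hypothetical helper: remembers the best value seen for each metric
    and calls `save_fn` whenever one of them improves."""

    def __init__(
        self,
        metrics: Iterable[MetricPath],
        save_fn: Callable[[str, Dict[str, Any]], None],
    ) -> None:
        self.metrics: List[MetricPath] = list(metrics)
        self.save_fn = save_fn  # assumed to wrap the real checkpoint call
        self.best: Dict[MetricPath, float] = {}

    def update(self, result: Dict[str, Any]) -> List[MetricPath]:
        """Check one training result; return the metrics that improved."""
        improved: List[MetricPath] = []
        for path in self.metrics:
            value = lookup(result, path)
            if value is None:
                continue  # metric not reported in this iteration
            if path not in self.best or value > self.best[path]:
                self.best[path] = value
                # The tag (e.g. "policy_reward_mean/agent_0") identifies
                # which "best" checkpoint should be overwritten.
                self.save_fn("/".join(path), result)
                improved.append(path)
        return improved


# Demo with made-up results; a real loop would pass each training result.
saved: List[str] = []
tracker = BestCheckpointTracker(
    metrics=[
        ("episode_reward_mean",),
        ("episode_reward_max",),
        ("policy_reward_mean", "agent_0"),  # assumed per-agent key layout
        ("policy_reward_mean", "agent_1"),
    ],
    save_fn=lambda tag, result: saved.append(tag),
)
tracker.update({
    "episode_reward_mean": 1.0,
    "episode_reward_max": 3.0,
    "policy_reward_mean": {"agent_0": 0.5, "agent_1": 0.5},
})
tracker.update({
    "episode_reward_mean": 0.8,
    "episode_reward_max": 4.0,
    "policy_reward_mean": {"agent_0": 0.9, "agent_1": 0.1},
})
```

In this sketch each metric keeps its own "best" slot, so one training iteration can trigger several saves. In a real experiment, `save_fn` would have to write the actual checkpoint (or only the relevant agent's policy) to a directory named after the tag.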
Is there already a way to achieve this that I have not found yet? Thanks for the help!