Hi Ray community,
I am successfully using the checkpointing combination of RLlib, Tune, and AIR, as shown in the example below.
It would be beneficial for me to create a checkpoint not at a fixed interval, like every 10th iteration, but every time I reach a better value for a metric, e.g. “episode_reward_max” or even a custom metric.
In other words, it would be like “create a checkpoint every time you reach a better value for episode_reward_max than in all previous iterations”.
Does anyone have experience with that?
Example that creates a checkpoint at a fixed interval of 10 iterations:
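from ray import air, tune

# `config` is the RLlib algorithm config, defined elsewhere.
tuner = tune.Tuner(
    "PPO",
    param_space=config,
    run_config=air.RunConfig(
        checkpoint_config=air.CheckpointConfig(
            # Rank kept checkpoints by mean episode reward (higher is better).
            checkpoint_score_attribute="episode_reward_mean",
            checkpoint_score_order="max",
            checkpoint_frequency=10,  # checkpoint every 10th iteration
            checkpoint_at_end=True,
        ),
    ),
    tune_config=tune.TuneConfig(num_samples=2),
)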
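
To make the rule I have in mind concrete, here is a minimal sketch in plain Python of "checkpoint only when the metric improves on every earlier iteration". `BestMetricTracker`, `should_checkpoint`, and the sample reward values are hypothetical names and data made up for illustration. They are not part of Ray, and this does not show how to hook the rule into Tune.

```python
import math


class BestMetricTracker:
    """Hypothetical helper (not a Ray API): remembers the best metric value so far."""

    def __init__(self, mode="max"):
        if mode not in ("max", "min"):
            raise ValueError("mode must be 'max' or 'min'")
        self.mode = mode
        # Start from the worst possible value so the first report always counts.
        self.best = -math.inf if mode == "max" else math.inf

    def should_checkpoint(self, value):
        """Return True only if `value` is strictly better than every earlier value."""
        if self.mode == "max":
            improved = value > self.best
        else:
            improved = value < self.best
        if improved:
            self.best = value
        return improved


# Made-up per-iteration values standing in for episode_reward_max.
rewards = [1.0, 3.0, 2.0, 3.0, 5.0]
tracker = BestMetricTracker(mode="max")
checkpoint_iterations = [
    i for i, reward in enumerate(rewards, start=1) if tracker.should_checkpoint(reward)
]
print(checkpoint_iterations)  # → [1, 2, 5]
```

With these values a checkpoint would be written at iterations 1, 2, and 5. Iteration 4 only ties the previous best, so it is skipped. Passing mode="min" would apply the same rule to a metric where lower is better.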