Use `checkpoint_score_attr` with custom metric

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty in completing my task, but I can work around it.

USING RAY==1.8.0 (cannot upgrade for now)

I’m trying to use the checkpoint_score_attr param in tune.run(), but I’m having some difficulty understanding when and how it works. From what I read, I should set checkpoint_score_attr=metric, where metric is a key in the result dictionary returned via tune.run().results. How can I add things to this result dictionary?

Another question: how can I tell tune to look at the evaluation metric and not at the training one?

Hey @fedetask, can you share what your Trainable looks like?

In general, if you’re using a function Trainable then you can report the metric with tune.report(**kwargs). With this, you can pass in the evaluation metric, the training metric, or both!
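For example, here’s a quick sketch (the trainable, the metric names, and the dummy values are all just placeholders):

import os

from ray import tune


def my_trainable(config):  # hypothetical function Trainable
    for step in range(10):
        train_score = float(step)        # placeholder training metric
        eval_score = float(step) * 0.9   # placeholder evaluation metric

        # Save a checkpoint so that Tune has something to keep/rank.
        with tune.checkpoint_dir(step=step) as checkpoint_dir:
            with open(os.path.join(checkpoint_dir, "state.txt"), "w") as f:
                f.write(str(step))

        # Every keyword passed to tune.report() becomes a key in the result
        # dict, so either metric can be used as checkpoint_score_attr.
        tune.report(train_metric=train_score, eval_metric=eval_score)


tune.run(
    my_trainable,
    keep_checkpoints_num=3,
    checkpoint_score_attr="eval_metric",  # rank kept checkpoints by the evaluation metric
)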

Let me know if this documentation helps!

I should have specified: I’m using Tune to train RLlib agents, so I do something like this:

from ray import tune
from ray.rllib.agents.dqn import DQNTrainer

config = {
    'callbacks': MyCallback,  # Computes useful metrics during training
    'evaluation_config': {
        'callbacks': MyEvaluationCallback  # Computes useful metrics in evaluation episodes
    }
}
tune.run(run_or_experiment=DQNTrainer, config=config)

(I skipped irrelevant configuration elements such as the environment, stopping criteria, etc.)

The metrics that I would like Tune to consider when keeping the best checkpoints are computed by MyEvaluationCallback, which looks like this:

from typing import Dict, Optional

from ray.rllib.agents.callbacks import DefaultCallbacks
from ray.rllib.env import BaseEnv
from ray.rllib.evaluation import MultiAgentEpisode, RolloutWorker
from ray.rllib.policy import Policy
from ray.rllib.utils.typing import PolicyID


class MyEvaluationCallback(DefaultCallbacks):

    # ... other methods that save info in episode.user_data

    def on_episode_end(self,
                       *,
                       worker: RolloutWorker,
                       base_env: BaseEnv,
                       policies: Dict[PolicyID, Policy],
                       episode: MultiAgentEpisode,
                       env_index: Optional[int] = None,
                       **kwargs) -> None:
        # Useful metrics go into the episode.custom_metrics dict
        # ("my_metric" here is just a placeholder key).
        episode.custom_metrics["my_metric"] = ...

Should I add a tune.report() call inside on_episode_end()?

Ah gotcha, in that case you should be able to store the (evaluation) metric as part of custom_metrics and then reference it for checkpoint_score_attr.
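Roughly something like the sketch below (untested on 1.8.0, so treat it as a starting point): "my_metric" stands in for whatever key you write into episode.custom_metrics, evaluation_interval and checkpoint_freq are assumptions needed so that evaluation actually runs and checkpoints actually get created, and RLlib aggregates each custom metric into <name>_mean/_min/_max with evaluation results nested under "evaluation/", which I believe you can reference with a "/"-separated path (worth double-checking on your version).

from ray import tune
from ray.rllib.agents.dqn import DQNTrainer

config = {
    'callbacks': MyCallback,                # as defined in your post
    'evaluation_interval': 5,               # assumption: evaluation must run for eval metrics to appear
    'evaluation_config': {
        'callbacks': MyEvaluationCallback,  # as defined in your post
    },
}

tune.run(
    run_or_experiment=DQNTrainer,
    config=config,
    checkpoint_freq=5,       # assumption: create a checkpoint every 5 training iterations
    keep_checkpoints_num=3,  # keep only the 3 best checkpoints...
    # ...ranked by the evaluation-time custom metric; "my_metric" is a placeholder
    # for the key you write into episode.custom_metrics.
    checkpoint_score_attr="evaluation/custom_metrics/my_metric_mean",
)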

See Callbacks and Custom Metrics for some more info and examples!