How can I deploy my reinforcement learning model trained with tune using the new API?

Daraan · September 10, 2025, 10:25am

I am not entirely sure whether this is relevant to your case, but it is good to know either way: are you aware that episode_return_mean is smoothed over config.metrics_num_episodes_for_smoothing episodes? See the topic I just posted:

In short, the min/mean/max you obtain from local_env_runner.get_metrics() are computed over the last metrics_num_episodes_for_smoothing sampled episodes; they are not bound to a single iteration.
Furthermore (depending on your Ray version), restoring metrics is broken; see [RLlib] Checkpoint metrics loading with Tune is broken in 2.47.0 · Issue #53877 · ray-project/ray · GitHub. In your case, I think the contribution of the old episodes to the smoothing (if your window reaches back that far) can be off or lost, so you may only get a smoothed value computed from the episodes sampled after you loaded the checkpoint.
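
To make the effect concrete, here is a minimal pure-Python sketch of a sliding-window statistic. It does not use Ray at all: `SmoothedReturn`, `WINDOW`, and the episode returns are made up for illustration, and the fixed-size window is only my assumption of how the smoothing behaves, not RLlib's actual implementation.

```python
from collections import deque
from statistics import mean


class SmoothedReturn:
    """Hypothetical stand-in for a windowed episode-return statistic."""

    def __init__(self, window):
        # Only the most recent `window` episode returns are kept.
        self.returns = deque(maxlen=window)

    def log_episodes(self, episode_returns):
        # Episodes from any iteration land in the same window.
        self.returns.extend(episode_returns)

    def stats(self):
        # min/mean/max over the window, not over one iteration.
        return min(self.returns), mean(self.returns), max(self.returns)


WINDOW = 4  # plays the role of metrics_num_episodes_for_smoothing

tracker = SmoothedReturn(WINDOW)
tracker.log_episodes([10.0, 20.0, 30.0])  # episodes from iteration 1
tracker.log_episodes([40.0, 50.0])        # episodes from iteration 2
# The window now holds 20, 30, 40, 50, so it mixes both iterations.
stats_iter2 = tracker.stats()

# Simulate a restore that does not bring the window back:
restored = SmoothedReturn(WINDOW)
restored.log_episodes([60.0])             # first episode after restoring
stats_after_restore = restored.stats()

# What an uninterrupted run would report after the same episode:
tracker.log_episodes([60.0])
stats_uninterrupted = tracker.stats()

print(stats_iter2, stats_after_restore, stats_uninterrupted)
```

The value reported after iteration 2 still includes returns from iteration 1, and the restored tracker only reflects the single post-restore episode, while an uninterrupted run would still average over the older episodes.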

You may have to cross-check whether things are actually correct but just not logged the way you would expect.
Cheers, and good luck.
