How do I evaluate my trained policy after tune.fit()

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I have configured a simple DQN training script (shown below). After training, how do I evaluate my multiple trained policies in code (not via the CLI)? I have searched Ray's documentation but can't find an example showing how to evaluate policies this way. With the code below I will have 4 trained policies after training (2 frameworks x 2 environments). For each policy I would like the mean reward over 100 episodes, so I should end up with 4 values.

import os
import argparse
import ray
from ray import air, tune
from ray.rllib.algorithms.dqn import DQNConfig

ray.init()

# Grid search over 2 frameworks x 2 environments -> 4 trials/policies.
param_space = DQNConfig()\
    .framework(framework=tune.grid_search(['torch', 'tf2']))\
    .environment(env=tune.grid_search(['ALE/Boxing-v5', 'ALE/VideoPinball-v5']))\
    .resources(num_gpus=1)

tune_config = tune.TuneConfig(
    num_samples=1,
)

run_config = air.RunConfig(
    name='base',
    stop={'timesteps_total': 1e6},
    checkpoint_config=air.CheckpointConfig(
        checkpoint_at_end=True,
    ),
    local_dir='rllib/train'
)

# Use a name other than `tune` so the `ray.tune` module imported above
# is not shadowed.
tuner = tune.Tuner(
    'DQN',
    param_space=param_space,
    tune_config=tune_config,
    run_config=run_config,
)

tuner.fit()

# how do I evaluate my trained policies here?

ray.shutdown()

Hi @rajfly,

In general, you can restore an algorithm from a checkpoint you recorded during training. This is shown, for example, here in the documentation. You could then call

restored_algo.evaluate() 

to evaluate the policy on the environment the agent in the checkpoint was trained on.
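For your concrete script, here is a minimal sketch of the post-training part (assuming Ray 2.x APIs: Tuner.fit() returns a ResultGrid, Algorithm.from_checkpoint() restores from a Result's checkpoint, and evaluate() reports its metrics under an 'evaluation' key; the result.config lookups are my assumption about how Tune flattens your param_space). To make evaluate() average over exactly 100 episodes, you would also add .evaluation(evaluation_duration=100, evaluation_duration_unit='episodes', evaluation_num_workers=1) to param_space before training, since the evaluation settings are restored together with the checkpoint:

from ray.rllib.algorithms.algorithm import Algorithm

results = tuner.fit()  # ResultGrid: one Result per trial, 4 in your case

mean_rewards = {}
for result in results:
    # Restore the trained algorithm from the trial's final checkpoint.
    algo = Algorithm.from_checkpoint(result.checkpoint)
    # evaluate() runs the configured evaluation workers; with
    # evaluation_duration=100 the reported mean is over 100 episodes.
    metrics = algo.evaluate()
    key = (result.config['framework'], result.config['env'])
    mean_rewards[key] = metrics['evaluation']['episode_reward_mean']
    algo.stop()

print(mean_rewards)  # 4 values: one mean episode reward per policy

You could also roll out episodes manually with algo.compute_single_action(), but for Atari you would then have to reproduce RLlib's DeepMind-style frame preprocessing yourself, so evaluate() is the simpler route here.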

Here you can also find some documentation about serving your RLlib models.
