Testing model performance after every training episode

overloader · May 21, 2023, 11:29pm

Hello

I would like the agent to be evaluated after each training iteration on a dataset it has not seen before, so that I can tell whether the model's performance has changed and, if it has, save a checkpoint for that iteration. How can I do that? Are there any examples I could read?

Learning algorithm: PPO
I have read that you can use two gym environments, one for training and one for testing, but I have no idea how to implement this.
Maybe I can do it with tune.Tuner or with the classic algo.train()?

Update: maybe validate_env solves my problem?

My code:

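    from ray.rllib.algorithms.ppo import PPOConfig
    from ray.tune.logger import pretty_print

    # "TrainEnvironment" is assumed to be registered already,
    # e.g. with ray.tune.register_env.
    config = (
        PPOConfig()
        .environment(env="TrainEnvironment")
        .framework("torch")
        .rollouts(num_rollout_workers=0, num_envs_per_worker=1)
        .resources(num_gpus=1)
    )

    algo = config.build()

    # One training iteration for now.
    for _ in range(1):
        result = algo.train()
        print(pretty_print(result))
        algo.validate_env(TrainEnv, config_test)  # maybe this can help?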

gjoliver · May 22, 2023, 10:26pm

Take a look at the evaluation-related configs you can use:

github.com: ray-project/ray/blob/master/rllib/algorithms/algorithm_config.py#L1815-L1834

          def evaluation(
              self,
              *,
              evaluation_interval: Optional[int] = NotProvided,
              evaluation_duration: Optional[Union[int, str]] = NotProvided,
              evaluation_duration_unit: Optional[str] = NotProvided,
              evaluation_sample_timeout_s: Optional[float] = NotProvided,
              evaluation_parallel_to_training: Optional[bool] = NotProvided,
              evaluation_config: Optional[
                  Union["AlgorithmConfig", PartialAlgorithmConfigDict]
              ] = NotProvided,
              off_policy_estimation_methods: Optional[Dict] = NotProvided,
              ope_split_batch_by_episode: Optional[bool] = NotProvided,
              evaluation_num_workers: Optional[int] = NotProvided,
              custom_evaluation_function: Optional[Callable] = NotProvided,
              always_attach_evaluation_results: Optional[bool] = NotProvided,
              enable_async_evaluation: Optional[bool] = NotProvided,
              # Deprecated args.
              evaluation_num_episodes=DEPRECATED_VALUE,
          ) -> "AlgorithmConfig":

There are also examples available utilizing these settings.
Let us know if anything looks confusing.
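
As a rough sketch of how these settings could fit the question above (not an official example): the config below asks RLlib to run an evaluation after every training iteration on a separate environment, and a small loop saves a checkpoint only when the evaluation reward improves. "TestEnvironment" is a hypothetical name for a held-out environment you would register yourself, `eval_reward`, `train_and_keep_best` and `build_ppo` are helper names made up for this sketch, and the `result["evaluation"]["episode_reward_mean"]` key and the return value of `algo.save()` are assumptions based on the Ray 2.x releases from around the time of this thread, so check them against your version.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def eval_reward(result: Dict[str, Any]) -> Optional[float]:
    """Mean evaluation episode reward from a train() result, or None.

    Assumes evaluation metrics are nested under the "evaluation" key.
    """
    return result.get("evaluation", {}).get("episode_reward_mean")


def train_and_keep_best(
    train_step: Callable[[], Dict[str, Any]],
    save: Callable[[], Any],
    num_iterations: int,
) -> Tuple[float, Any]:
    """Run training iterations and save only when evaluation improves."""
    best = float("-inf")
    best_checkpoint = None
    for _ in range(num_iterations):
        result = train_step()
        reward = eval_reward(result)
        # Iterations without evaluation results are skipped.
        if reward is not None and reward > best:
            best = reward
            best_checkpoint = save()
    return best, best_checkpoint


def build_ppo():
    """Build PPO with evaluation on a separate (hypothetical) test env."""
    # Imported here so the helpers above can be used without Ray installed.
    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment(env="TrainEnvironment")
        .framework("torch")
        .rollouts(num_rollout_workers=0, num_envs_per_worker=1)
        .resources(num_gpus=1)
        .evaluation(
            evaluation_interval=1,  # evaluate after every train() call
            evaluation_duration=10,
            evaluation_duration_unit="episodes",
            # Overrides applied only to the evaluation workers.
            # "TestEnvironment" is an assumed, separately registered env.
            evaluation_config={"env": "TestEnvironment", "explore": False},
        )
    )
    return config.build()
```

With both environments registered, something like `algo = build_ppo()` followed by `train_and_keep_best(algo.train, algo.save, num_iterations=10)` should return the best evaluation reward seen and whatever `algo.save()` returned for that iteration.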
