Hello
I would like that after each iteration, the agent checks the model on a dataset that it has not previously seen in order to understand if there are changes in the performance of the model and, if so, save the checkpoint of a particular iteration. How can I do that ? Perhaps there are examples where I could read about it
Learning Algorithm - PPO
I read about the fact that you can use 2 gym environment. One for training and second for testing. But not have ideas about implementation
maybe i can do it using tune.Tuner or with classic algo.train()
Update: maybe validation_env solve my problem ?
My code:
config = (
PPOConfig()
.environment(env="TrainEnvironment")
.framework("torch")
.rollouts(num_rollout_workers=0, num_envs_per_worker=1)
.resources(num_gpus=1)
)
algo = config.build()
for _ in range(1):
result = algo.train()
print(pretty_print(result))
algo.validate_env(TrainEnv, config_test) # maybe this can help