I am using a PPOTrainer with the following code:
...
config = DEFAULT_CONFIG.copy()
trainer = PPOTrainer(config=config, env=select_env)
result = trainer.train()
...
How can I include a stopping criterion in the PPOTrainer?
Hi @carlorop,
Try using tune to run your algorithm. It is the recommended way to run RLlib algorithms and will become the standard in the near future as Ray's components are integrated more closely.
This is how you can run your algorithm with tune and set a stopping criterion:
from ray import tune

config["env"] = select_env
tune.run(
    PPOTrainer,  # pass the trainer class (or the registered string "PPO"), not an instance
    config=config,
    stop={"training_iteration": 10000},
    fail_fast=True,
)
Hope this helps
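For reference, the stop dict can combine several criteria; tune stops a trial as soon as any one of them is met. The keys below are standard fields of the result returned by train() (the exact threshold values here are just illustrative):

```python
# Stop when ANY of these result fields reaches its threshold.
stop = {
    "training_iteration": 10000,     # hard cap on iterations
    "episode_reward_mean": 200.0,    # stop early once the reward target is hit
    "timesteps_total": 1_000_000,    # or after this many environment steps
}
```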
Hi @carlorop ,
To add to @Lars_Simon_Zehnder’s advice:
You can customize the stopping behaviour of tune experiments in several ways:
From the docstrings:
stop (dict | callable | :class:`Stopper`): Stopping criteria. If dict,
the keys may be any field in the return result of 'train()',
whichever is reached first. If function, it must take (trial_id,
result) as arguments and return a boolean (True if trial should be
stopped, False otherwise). This can also be a subclass of
``ray.tune.Stopper``, which allows users to implement
custom experiment-wide stopping (i.e., stopping an entire Tune
run based on some time constraint).
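As a concrete sketch of the callable variant from the docstring: the function receives (trial_id, result) and returns True to stop that trial. The metric names and thresholds below are illustrative assumptions; substitute whatever fields your train() results contain. You would then pass it as stop=stop_fn to tune.run.

```python
def stop_fn(trial_id, result):
    # Stop this trial once the mean episode reward reaches a target,
    # or after a hard cap on training iterations, whichever comes first.
    return (
        result["episode_reward_mean"] >= 200.0
        or result["training_iteration"] >= 10000
    )
```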