Stopping criteria for PPOTrainer

I am using a PPOTrainer with the following code:

...
config = DEFAULT_CONFIG.copy()
trainer = PPOTrainer(config=config, env=select_env)
result = trainer.train()
...

How can I add a stopping criterion to the PPOTrainer?

Hi @carlorop ,

Try running your algorithm with Tune. It is the recommended way to run RLlib algorithms and will become the standard in the near future, as Ray's components become more tightly integrated.

This is how you can run your algorithm with tune and set some stopping criteria:

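from ray import tune

config["env"] = select_env
tune.run(
    PPOTrainer,  # pass the trainer class (or the string "PPO"), not an instance
    config=config,
    stop={"training_iteration": 10000},  # stop after 10000 training iterations
    fail_fast=True,  # abort the whole run on the first trial error
)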

Hope this helps


Hi @carlorop ,

To add to @Lars_Simon_Zehnder’s advice: you can customize the stopping behaviour of Tune experiments in several ways.
From the docstring for the stop argument:

stop (dict | callable | :class:`Stopper`): Stopping criteria. If dict,
            the keys may be any field in the return result of 'train()',
            whichever is reached first. If function, it must take (trial_id,
            result) as arguments and return a boolean (True if trial should be
            stopped, False otherwise). This can also be a subclass of
            ``ray.tune.Stopper``, which allows users to implement
            custom experiment-wide stopping (i.e., stopping an entire Tune
            run based on some time constraint).
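
As a rough sketch of the callable form described in the docstring above: the function below takes (trial_id, result) and returns True when the trial should stop. The names stop_fn, REWARD_TARGET and MAX_ITERATIONS are made up for illustration, and the sketch assumes your result dict contains the keys "episode_reward_mean" and "training_iteration", so adjust them to whatever train() returns in your setup.

```python
# Hypothetical thresholds, chosen only for illustration.
REWARD_TARGET = 200.0
MAX_ITERATIONS = 10000


def stop_fn(trial_id, result):
    """Return True if the trial should stop, False otherwise."""
    # Assumed metric names; missing keys are treated as "not reached yet".
    reached_reward = result.get("episode_reward_mean", float("-inf")) >= REWARD_TARGET
    out_of_budget = result.get("training_iteration", 0) >= MAX_ITERATIONS
    return reached_reward or out_of_budget


# The same "whichever is reached first" rule written as a dict:
stop_dict = {
    "episode_reward_mean": REWARD_TARGET,
    "training_iteration": MAX_ITERATIONS,
}

# Either one would be passed to Tune like this (not executed here):
# tune.run(PPOTrainer, config=config, stop=stop_fn)
```

The dict form is enough for simple thresholds. The function form is worth it once the decision depends on more than one metric at a time or on custom logic.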