"Iteration always 1" challenge

Even though I read the Tune FAQ article Why are all my trials returning “1” iteration?, I am concerned that my implementation is not working correctly. Indeed, I face exactly that issue: all 5 of my trials return 1 iteration.

How can I verify that the algorithm's training has actually been executed? Should I look at num_sgd_iter?

    # `config` is the PPO param space defined earlier; `stopping_criteria` is shown below.
    from ray import tune
    from ray.tune import CheckpointConfig, RunConfig  # in older Ray versions these live in ray.air

    tuner = tune.Tuner(
        "PPO",
        param_space=config,
        run_config=RunConfig(
            stop=stopping_criteria,
            checkpoint_config=CheckpointConfig(
                checkpoint_score_attribute="episode_reward_mean",
                checkpoint_score_order="max",
                checkpoint_frequency=2,
            ),
        ),
        tune_config=tune.TuneConfig(
            metric="episode_reward_mean",
            mode="max",
            num_samples=5,
            reuse_actors=False,
            max_concurrent_trials=3,
        ),
    )
    results = tuner.fit()

The interesting part here would be the stopping_criteria you defined. Can you share those with us?

RLlib returns a number of metrics that can be checked to see whether training took place, but yes, num_sgd_iter is a good candidate for seeing how often SGD optimization has been triggered.
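
For example, you can inspect what each trial actually reported after fitting (a minimal sketch, assuming Ray 2.x and the results grid returned by tuner.fit() in your snippet; the metric keys are the same ones you already use in your stopping criteria):

    # Iterate over the ResultGrid returned by tuner.fit() and print key metrics per trial.
    for result in results:
        print(
            result.metrics["training_iteration"],   # how many Tune iterations actually ran
            result.metrics["timesteps_total"],      # environment timesteps sampled in total
            result.metrics["episode_reward_mean"],  # mean episode return reported so far
        )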

The stopping criteria are as follows, still to be fine-tuned:

    stopping_criteria = {
        "training_iteration": 5,
        "timesteps_total": 6,
        "episode_reward_mean": 5,
    }

While the iter column in the trial summary output always remains 1, the num_sgd_iter and ts values do indeed differ between trials. However, num_sgd_iter is itself a tuning parameter, sampled per trial via

"num_sgd_iter": tune.randint(100, 1000),

The stopping criteria are OR conditions, i.e. as soon as one of them is met, training is stopped. A single PPO training iteration collects a full train batch of environment timesteps (several thousand by default), so timesteps_total already exceeds 6 after the first iteration; hence you see only one result per trial.

If you comment out timesteps_total and episode_reward_mean from the stopping criteria, each trial should run for the full 5 iterations.
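
Concretely, a sketch of the fix (re-enable the other criteria once you have picked realistic thresholds):

    stopping_criteria = {
        "training_iteration": 5,
        # "timesteps_total": 6,       # reached within the first iteration, stops the trial immediately
        # "episode_reward_mean": 5,   # re-enable later with a realistic target return
    }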
