Limit number of steps?

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Ray’s tune.run() is running endlessly. Can I force it to save a checkpoint and begin a new iteration after a number of steps?

analysis = tune.run(
    "PPO",
    stop={
        "episode_reward_mean": 2,
        "training_iteration": 35,
    },
    config={
        "env": "TradingEnv",
        "env_config": env_config_training,
        "log_level": "ERROR",
        #"log_level": "INFO",
        #"log_level": "DEBUG",
        "framework": "torch",
        "ignore_worker_failures": False,
        "clip_rewards": True,
        "lr": LR,
        "lr_schedule": [
            [0, 1e-1],
            [int(1e2), 1e-2],
            [int(1e3), 1e-3],
            [int(1e4), 1e-4],
            [int(1e5), 1e-5],
            [int(1e6), 1e-6],
            [int(1e7), 1e-7]
        ],
        "model": {
            "use_lstm": True,
            "lstm_cell_size": 512
        },
        "gamma": GAMMA,
        "observation_filter": "MeanStdFilter",
        "lambda": LAMBDA,
        "vf_share_layers": True,
        "vf_loss_coeff": VF_LOSS_COEFF,
        "entropy_coeff": ENTROPY_COEFF,
        "evaluation_interval": 1,  # Run evaluation on every iteration
        "evaluation_config": {
            "env_config": env_config_evaluation,  # The dictionary we built before (only the overriding keys to use in evaluation)
            "explore": False,  # We don't want to explore during evaluation. All actions have to be repeatable.
        },
    },
    metric=checkpoint_metric,
    mode="max",
    search_alg=search_alg,
    scheduler=scheduler,
    num_samples=10,  # Samples per hyperparameter combination. More averages out randomness; fewer run faster
    keep_checkpoints_num=10,  # Keep the last 10 checkpoints
    checkpoint_freq=1,  # Checkpoint on every iteration (slower, but lets you pick exactly which checkpoint to use later)
#    resume="AUTO",
    local_dir="./results",
    name=f"testing_{int(time.time()-1651400000)}",
    trial_name_creator=Methods.trial_name_string
)

Does timesteps_total under stop work for you?

I will try it. I’m surprised the option is not listed in the documentation.
https://docs.ray.io/en/latest/tune/api_docs/stoppers.html
https://docs.ray.io/en/latest/tune/tutorials/tune-stopping.html

It does not; for some reason tune.run() ignores it and keeps training for many timesteps even when "timesteps_total": 1 is under stop.

analysis = tune.run("PPO",
    stop={
        "timesteps_total": 1,
    }, ...) # does not work

@arturn When you get a chance, could you help @Christian_Coletti with this question?


Hey @Christian_Coletti, hey @xwjiang2010 ,

The option is not listed under the Tune API docs because it is specific to RLlib, so timesteps_total is not a generic metric for everything tunable with Tune. Every iteration, you can have a look at the output of the analysis and you will find something like this …

...
timestamp: 1651834198
timesteps_since_restore: 0
timesteps_total: 12000
training_iteration: 3
trial_id: e577a_00000
...

… amongst many other metrics and pieces of info. These are the ones you can choose from for the given algorithm. They vary per algorithm, but training_iteration and timesteps_total are ubiquitous.
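For completeness, a minimal sketch (CartPole-v0 standing in for your env, thresholds made up): stop also accepts a callable that receives each reported result dict, so any of the metrics above can be combined freely:

from ray import tune

def stopper(trial_id, result):
    # Stop a trial once it has sampled 10k env steps or finished 5 iterations.
    return result["timesteps_total"] >= 10_000 or result["training_iteration"] >= 5

analysis = tune.run(
    "PPO",
    stop=stopper,
    config={"env": "CartPole-v0", "framework": "torch"},
)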

Regarding your experiment:
To reproduce I had to cut out a couple of config parameters and have used the following:

from ray import tune

analysis = tune.run(
    "PPO",
    stop={
        "timesteps_total": 10,
        "episode_reward_mean": 45,
        "training_iteration": 20,
    },
    config={
        "env": "CartPole-v0",
        "log_level": "ERROR",
        "framework": "torch",
        "ignore_worker_failures": False,
        "clip_rewards": True,
        "lr_schedule": [
            [0, 1e-1],
            [int(1e2), 1e-2],
            [int(1e3), 1e-3],
            [int(1e4), 1e-4],
            [int(1e5), 1e-5],
            [int(1e6), 1e-6],
            [int(1e7), 1e-7]
        ],
        "model": {
            "use_lstm": True,
            "lstm_cell_size": 512
        },
        "observation_filter": "MeanStdFilter",
        "evaluation_interval": 1,
    },
    mode="max",
    keep_checkpoints_num=10,
    checkpoint_freq=1,
    local_dir="~/ray_results/",
    name="test",
    resume=False,
)

This works on my side. The experiment stops after the first iteration, because the timesteps are exceeded. Similar logic works for episode_reward_mean or training_iteration.
Can you confirm this, or provide a script that does not show the desired behaviour out of the box? Ideally with the version of Ray you are using :slightly_smiling_face:
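If it helps verify which criterion triggered, the last reported result of each trial can be read off the analysis object (a small sketch; the columns carry the same metric names as shown above):

# analysis is the return value of tune.run() from the script above.
df = analysis.results_df
print(df[["timesteps_total", "training_iteration", "episode_reward_mean"]])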

Cheers


Hello,

My tune.run() is running endlessly and not beginning new iterations. I’m looking for a way to shorten an iteration and force it to save a checkpoint.

Since you said

I believe this does not answer my question

Thanks

No problem! Is my script running endlessly on your side? Can you confirm this, or provide a script that does not show the desired behaviour out of the box? Ideally with the version of Ray you are using :slightly_smiling_face:
Your script is not executable as is, and removing the unknowns to run something similar does not reproduce your undesired outcome.

Is there maybe a misunderstanding? My understanding is that Tune never finishes the first training iteration and is therefore never able to checkpoint, since there is no result yet. Correct?
To shorten iterations you will have to go through RLlib, which has traditionally used "timesteps_per_iteration" and has since switched to "min_sample_timesteps_per_reporting" and "min_train_timesteps_per_reporting" in case you are working with a nightly build.
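Roughly like this (a sketch only; which key applies depends on the Ray version you have installed, so please check it against your RLlib):

from ray import tune

config = {
    "env": "CartPole-v0",  # stand-in; put your TradingEnv and env_config here
    "framework": "torch",
    # Older releases: roughly how many env steps go into one training iteration.
    # A smaller value gives shorter iterations and therefore more frequent
    # results and checkpoints (with checkpoint_freq=1).
    "timesteps_per_iteration": 1000,
    # On a recent nightly, the equivalent keys would be:
    # "min_sample_timesteps_per_reporting": 1000,
    # "min_train_timesteps_per_reporting": 1000,
}

analysis = tune.run(
    "PPO",
    config=config,
    stop={"training_iteration": 20},
    checkpoint_freq=1,
)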

Best