Learning rate annealing with tune.run()

Is there a way to set learning rate annealing with tune.run()?

The grid search below will run two trials: one with LR 1e-5 and the other with 1e-6. How do I, for example, set up an LR schedule so that over 1000 iterations the LR is reduced from 1e-5 to 1e-6 and from 1e-6 to 1e-7, respectively? I’ve gone through the docs and can’t seem to find a solution.

config = {
  "env" : Env,
  "lr": grid_search([1e-5, 1e-6]),
  # etc
}

stop = {
  "training_iteration": 1000,
}

results = tune.run(
  "A3C",
  config=config, 
  stop=stop,
)

Hey @RickLan, there is an lr_schedule config key for A3C. Try the following:

config = {
    "lr_schedule": [[0, 1e-5], [1000000, 1e-6]],
}

The 1000000 is the (sampled) timestep at which you would like the 1e-6 to be reached. RLlib will linearly decrease the learning rate from 1e-5 to 1e-6 and, after 1M timesteps, stick with 1e-6 as the final value.
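For intuition, the interpolation behaves like the following minimal sketch (illustrative only, not RLlib’s actual implementation; the helper lr_at is made up for demonstration):

def lr_at(ts, schedule):
    # Linearly interpolate between two [timestep, value] endpoints;
    # past the last endpoint, stick with the final value.
    (t0, v0), (t1, v1) = schedule
    if ts >= t1:
        return v1
    frac = (ts - t0) / (t1 - t0)
    return v0 + frac * (v1 - v0)

schedule = [[0, 1e-5], [1000000, 1e-6]]
print(lr_at(0, schedule))        # 1e-05
print(lr_at(500000, schedule))   # ~5.5e-06
print(lr_at(2000000, schedule))  # 1e-06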

Thank you @sven1977
Is there a way to combine it with tune.grid_search()?

Yeah, you should be able to do something like this in your code:

config = {
    "lr_schedule": tune.grid_search([
        [[0, 0.01], [1e6, 0.00001]],
        [[0, 0.001], [1e9, 0.0005]],
    ]),
}

to test two different schedules. Could you try it and let us know whether it works?

@sven1977 That’s very elegant. It works. However, it seems that the first learning rate is always 1e-4 (the “lr” default is 1e-4).

I’m reading “ray/tune/info/learner/cur_lr” in TensorBoard.
Using Ray v1.2.0.

Test code:

import ray
from ray import tune

from ray.rllib.examples.env.random_env import RandomEnv

config = {
  "env": RandomEnv,
  "lr_schedule" : tune.grid_search([
    [[0, 1e-5], [25e3, 1e-6]],
    [[0, 1e-6], [25e3, 1e-7]],
  ]),
}

stop = {
  "training_iteration": 5,
}

ray.init()

results = tune.run(
  "A3C",
  name="test-lr_schedule",
  config=config, 
  stop=stop, 
)

ray.shutdown()
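
For reference: I’m viewing the curves with TensorBoard pointed at Ray’s default output directory, i.e. tensorboard --logdir ~/ray_results/test-lr_schedule.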

Yes, you are right: the very first lr used is the “lr” value, ignoring the schedule; after that, we correctly switch to using the schedule. This is a small bug that we should fix for clarity.
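
In the meantime, a possible workaround (a suggestion, not an official fix) is to set “lr” explicitly to the schedule’s starting value, so even the very first update uses the intended rate:

config = {
  "env": RandomEnv,
  # Assumption: aligning "lr" with lr_schedule[0][1] makes the first
  # update (which reads "lr", per the bug above) match the schedule.
  "lr": 1e-5,
  "lr_schedule": [[0, 1e-5], [25e3, 1e-6]],
}

When grid-searching over schedules with different starting values, a single fixed "lr" cannot match them all; something like "lr": tune.sample_from(lambda spec: spec.config.lr_schedule[0][1]) might tie the two together, but I have not tested that.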

This one here should fix the problem. We are now covering this with a test as well.
