Learning rate annealing with tune.run()

Is there a way to set learning rate annealing with tune.run()?

The grid search below will run two trials: one with LR 1e-5 and the other with 1e-6. How do I, for example, set up an LR schedule so that over 1000 iterations the LR is reduced from 1e-5 to 1e-6 and from 1e-6 to 1e-7, respectively? I’ve gone through the docs and can’t seem to find a solution.

config = {
  "env" : Env,
  "lr": grid_search([1e-5, 1e-6]),
  # etc
}

stop = {
  "training_iteration": 1000,
}

results = tune.run(
  "A3C",
  config=config, 
  stop=stop,
)

Hey @RickLan, there is an lr_schedule config key for A3C. Try the following:

config = {
    "lr_schedule": [[0, 1e-5], [1000000, 1e-6]],
}

The 1000000 is the (sampled) timestep at which you would like the 1e-6 to be reached. RLlib will linearly decrease the learning rate from 1e-5 to 1e-6 and, after 1M timesteps, stick with 1e-6 as the final value.
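For intuition, the interpolation behaves like the following minimal sketch (illustrative only, not RLlib’s actual implementation; the helper lr_at is made up for demonstration):

def lr_at(ts, schedule):
    # Linearly interpolate between two [timestep, value] endpoints;
    # past the last endpoint, stick with the final value.
    (t0, v0), (t1, v1) = schedule
    if ts >= t1:
        return v1
    frac = (ts - t0) / (t1 - t0)
    return v0 + frac * (v1 - v0)

schedule = [[0, 1e-5], [1000000, 1e-6]]
print(lr_at(0, schedule))        # 1e-05
print(lr_at(500000, schedule))   # ~5.5e-06
print(lr_at(2000000, schedule))  # 1e-06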

Thank you @sven1977
Is there a way to combine it with tune.grid_search()?

Yeah, you should be able to do something like this in your code:

config = {
    "lr_schedule": tune.grid_search([
        [[0, 0.01], [1e6, 0.00001]],
        [[0, 0.001], [1e9, 0.0005]],
    ]),
}

to test two different schedules. Could you try it and let us know whether it works?

@sven1977 That’s very elegant. It works. However, it seems that the first learning rate is always 1e-4 (the “lr” default is 1e-4).

I’m reading “ray/tune/info/learner/cur_lr” in TensorBoard.
Using Ray v1.2.0.

Test code:

import ray
from ray import tune

from ray.rllib.examples.env.random_env import RandomEnv

config = {
  "env": RandomEnv,
  "lr_schedule" : tune.grid_search([
    [[0, 1e-5], [25e3, 1e-6]],
    [[0, 1e-6], [25e3, 1e-7]],
  ]),
}

stop = {
  "training_iteration": 5,
}

ray.init()

results = tune.run(
  "A3C",
  name="test-lr_schedule",
  config=config, 
  stop=stop, 
)

ray.shutdown()
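
For reference: I’m viewing the curves with TensorBoard pointed at Ray’s default output directory, i.e. tensorboard --logdir ~/ray_results/test-lr_schedule.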

Yes, you are right: the very first lr used is the “lr” value, ignoring the schedule; after that, we correctly switch to using the schedule. This is a small bug that we should fix for clarity.
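
In the meantime, a possible workaround (a suggestion, not an official fix) is to set “lr” explicitly to the schedule’s starting value, so even the very first update uses the intended rate:

config = {
  "env": RandomEnv,
  # Assumption: aligning "lr" with lr_schedule[0][1] makes the first
  # update (which reads "lr", per the bug above) match the schedule.
  "lr": 1e-5,
  "lr_schedule": [[0, 1e-5], [25e3, 1e-6]],
}

When grid-searching over schedules with different starting values, a single fixed "lr" cannot match them all; something like "lr": tune.sample_from(lambda spec: spec.config.lr_schedule[0][1]) might tie the two together, but I have not tested that.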

This one here should fix the problem. We are now covering this with a test as well.
