SAC: two different sets of learning rates?

cmy28 · June 19, 2025, 7:29pm

1. Severity of the issue: (select one)
Low: Annoying but doesn’t hinder my work.

2. Environment:

Ray version: 2.47.1
Python version: 3.11
OS: Linux

The SAC source code defines two different sets of learning rates (actor, critic, and alpha). Even their default values differ, and the user can assign different values to each of them through the config. Why is that, and which set is actually used? Is one of the sets deprecated?

The code below is copied from this link:
https://docs.ray.io/en/latest/_modules/ray/rllib/algorithms/sac/sac.html

self.optimization = {
    "actor_learning_rate": 3e-4,
    "critic_learning_rate": 3e-4,
    "entropy_learning_rate": 3e-4,
}
self.actor_lr = 3e-5
self.critic_lr = 3e-4
self.alpha_lr = 3e-4

christina · June 24, 2025, 4:17pm

Hello! There was an RLlib migration, so one set is used by the new stack and the other by the old stack. It is ultimately up to you which one you use, but I suggest moving to the new stack if possible.

actor_learning_rate will only be used if you use the old stack.
actor_lr will only be used if you use the new stack.
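
To make the rule concrete, here is a minimal sketch in plain Python. It does not import or call RLlib: `effective_learning_rates` is a hypothetical helper written only for illustration, and the default values are the ones quoted from sac.py in the question. It assumes that keys belonging to the other stack's set are simply ignored, which is how I read the rule above.

```python
from typing import Optional

# Defaults as quoted from sac.py in the question above.
OLD_STACK_DEFAULTS = {
    "actor_learning_rate": 3e-4,
    "critic_learning_rate": 3e-4,
    "entropy_learning_rate": 3e-4,
}
NEW_STACK_DEFAULTS = {
    "actor_lr": 3e-5,
    "critic_lr": 3e-4,
    "alpha_lr": 3e-4,
}


def effective_learning_rates(new_stack: bool, overrides: Optional[dict] = None) -> dict:
    """Hypothetical helper (not part of RLlib): return the learning rates
    that would take effect on the chosen stack."""
    defaults = NEW_STACK_DEFAULTS if new_stack else OLD_STACK_DEFAULTS
    overrides = overrides or {}
    # Assumption: only keys from the active stack's set are read;
    # values given under the other stack's names have no effect.
    return {key: overrides.get(key, default) for key, default in defaults.items()}


if __name__ == "__main__":
    user_settings = {"actor_learning_rate": 1e-3}  # old-stack name
    print("new stack:", effective_learning_rates(True, user_settings))
    print("old stack:", effective_learning_rates(False, user_settings))
```

So if you set actor_learning_rate while running on the new stack, the actor would still train with the actor_lr value. In RLlib itself these values go through the SAC config; please check the API reference for your Ray version for the exact argument names.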

I went to ask our engineering team, who also clarified that “our main learning rate attribute is named lr and so we wanted to adhere to this naming for other learning rates”.

Since the old stack is going to be deprecated later this year, I recommend using the new stack. The ambiguities should likely be gone by then too. Thanks for the question!
