1. Severity of the issue: (select one)
Low: Annoying but doesn’t hinder my work.
2. Environment:
- Ray version: 2.47.1
- Python version: 3.11
- OS: Linux
In the SAC source code, two different sets of learning rates (actor, critic, and alpha) are defined. Their default values are not the same (the actor learning rate is 3e-4 in one set and 3e-5 in the other), and the user can assign values to both sets through the config. Why are there two sets, and which one is actually used? Is one of them deprecated?
The code below is copied from this link:
https://docs.ray.io/en/latest/_modules/ray/rllib/algorithms/sac/sac.html
self.optimization = {
    "actor_learning_rate": 3e-4,
    "critic_learning_rate": 3e-4,
    "entropy_learning_rate": 3e-4,
}
self.actor_lr = 3e-5
self.critic_lr = 3e-4
self.alpha_lr = 3e-4
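
To make the discrepancy concrete, here is a minimal sketch that puts the two sets side by side and reports which entries differ. It does not import Ray: `defaults` is a stand-in object filled with the values copied from the snippet above, and `collect_lr_sets` and `differing_keys` are hypothetical helper names of my own, not part of RLlib.

```python
from types import SimpleNamespace


def collect_lr_sets(config):
    """Gather both groups of SAC learning rates from a config-like object."""
    # Set 1: the entries stored in the `optimization` dict.
    optimization = getattr(config, "optimization", None) or {}
    dict_set = {
        "actor": optimization.get("actor_learning_rate"),
        "critic": optimization.get("critic_learning_rate"),
        "alpha": optimization.get("entropy_learning_rate"),
    }
    # Set 2: the standalone attributes.
    attr_set = {
        "actor": getattr(config, "actor_lr", None),
        "critic": getattr(config, "critic_lr", None),
        "alpha": getattr(config, "alpha_lr", None),
    }
    return {"optimization_dict": dict_set, "attributes": attr_set}


def differing_keys(lr_sets):
    """Return the names whose values differ between the two sets."""
    dict_set = lr_sets["optimization_dict"]
    attr_set = lr_sets["attributes"]
    return sorted(name for name in dict_set if dict_set[name] != attr_set[name])


# Stand-in for the defaults quoted above (values copied from the snippet,
# not read from an installed Ray version).
defaults = SimpleNamespace(
    optimization={
        "actor_learning_rate": 3e-4,
        "critic_learning_rate": 3e-4,
        "entropy_learning_rate": 3e-4,
    },
    actor_lr=3e-5,
    critic_lr=3e-4,
    alpha_lr=3e-4,
)

lr_sets = collect_lr_sets(defaults)
print(lr_sets)
print(differing_keys(lr_sets))  # prints ['actor']
```

Assuming a real `SACConfig()` instance exposes the same attribute names as the quoted source, it could be passed to `collect_lr_sets` in place of `defaults` to see the values of an installed version. This only shows what is stored on the config; it does not tell which set the learner actually reads.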