How does RLlib parallelise gradient computation and updating?

Alberto_Mordonini · September 27, 2023, 4:01pm

Apologies for raising a question that may seem trivial to many users, but I still find myself unclear on how RLlib facilitates the parallelization of gradient computations.

I understand that multiple Rollout workers collect experiences, from which the Learner (or RLModule) then samples batches for gradient optimization.

My question is how RLlib works on a cluster, specifically how gradient computation and aggregation are handled across learners.

Am I right in summarizing the process as follows?

1. The model (neural network) is duplicated and distributed to each Learner (if num_learner_workers > 0).
2. Every Learner receives experiences from the rollout workers to sample from.
3. After sampling, each Learner computes gradients on its batch and sends them to the head node, whose task is to aggregate them.
4. Depending on the aggregation strategy (synchronous or asynchronous), the head node updates the main model weights according to the gradients received from the Learners.
5. Finally, every Learner’s model is synchronized with the main one.
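
To make the synchronous variant of this mental model concrete, here is a minimal, framework-free sketch. It is not RLlib code and makes no claim about how RLlib actually aggregates gradients (for example, on the head node versus directly between learners). The names `local_gradient`, `synchronous_step` and `train` are hypothetical, and the "model" is a single weight fitted to y = 3x.

```python
from typing import List, Tuple

Batch = List[Tuple[float, float]]  # (x, y) pairs


def local_gradient(weight: float, batch: Batch) -> float:
    """Gradient of the mean squared error for the toy model y_hat = weight * x."""
    return sum(2.0 * (weight * x - y) * x for x, y in batch) / len(batch)


def synchronous_step(replicas: List[float], shards: List[Batch], lr: float) -> List[float]:
    """One synchronous data-parallel update over all learner replicas."""
    # Each learner computes a gradient on its own shard of experiences.
    grads = [local_gradient(w, shard) for w, shard in zip(replicas, shards)]
    # Aggregation: average the per-learner gradients.
    avg_grad = sum(grads) / len(grads)
    # Update the main weights (all replicas are identical, so use the first).
    new_weight = replicas[0] - lr * avg_grad
    # Synchronization: every learner ends the step with the same weights.
    return [new_weight for _ in replicas]


def train(num_learners: int = 2, steps: int = 50, lr: float = 0.05) -> List[float]:
    """Fit y = 3x with `num_learners` simulated learners."""
    data = [(float(x), 3.0 * x) for x in range(1, 5)]
    # Split the data into equally sized shards, one per learner.
    shards = [data[i::num_learners] for i in range(num_learners)]
    replicas = [0.0] * num_learners
    for _ in range(steps):
        replicas = synchronous_step(replicas, shards, lr)
    return replicas


if __name__ == "__main__":
    print(train())
```

With equally sized shards, the averaged gradient equals the gradient over the full batch, which is why synchronous data-parallel training behaves like single-learner training on a larger batch. An asynchronous strategy would instead apply each learner's gradient as it arrives, without waiting for the others.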

Could you please confirm whether this is accurate? I would greatly appreciate any guidance or suggestions.
