Parallelizing rollout sampling and learning for SAC

Hello community,
I am working on a problem using SAC. So far, I am trying to optimize throughput. I am running on a single node with 1 GPU allocated for learning, and 32 cores. I am running 24 rollout workers with several environments each.
I am wondering whether it is possible to parallelize rollout sampling and learning. Currently, I observe the CPUs being busy for a while with the GPU idle, then the CPUs idle down and the GPU gets busy, and only once the learning steps have finished does rollout sampling continue.
This is probably a natural consequence of encapsulating a single training iteration in the train() function, but I can't seem to find a straightforward way to overlap the two.
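
To illustrate what I mean outside of the train() abstraction, here is a rough, framework-agnostic sketch of the kind of overlap I am after: a background thread collects rollouts on the CPU into a shared replay buffer while the main thread runs SAC-style gradient updates. All names here (DummyEnv, sac_update, etc.) are placeholders I made up for the sketch, not actual library APIs:

```python
# Sketch: decoupled CPU rollout sampling and GPU learning for an
# off-policy algorithm like SAC. Placeholder names throughout.
import random
import threading
import time
from collections import deque

replay_buffer = deque(maxlen=1_000_000)
buffer_lock = threading.Lock()
stop_event = threading.Event()


class DummyEnv:
    """Stand-in for a real environment; returns fake transitions."""
    def reset(self):
        return [0.0, 0.0]

    def step(self, action):
        next_obs = [random.random(), random.random()]
        reward = random.random()
        done = random.random() < 0.01
        return next_obs, reward, done


def sampler_loop(env):
    """CPU-side rollout collection, running concurrently with learning."""
    obs = env.reset()
    while not stop_event.is_set():
        action = random.random()            # placeholder for policy inference
        next_obs, reward, done = env.step(action)
        with buffer_lock:
            replay_buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs


def sac_update(batch):
    """Placeholder for the actual SAC actor/critic update on the GPU."""
    time.sleep(0.001)


def learner_loop(batch_size=256, num_updates=1_000):
    """GPU-side learning that samples from the buffer as it grows."""
    updates_done = 0
    while updates_done < num_updates:
        with buffer_lock:
            if len(replay_buffer) < batch_size:
                batch = None
            else:
                batch = random.sample(list(replay_buffer), batch_size)
        if batch is None:
            time.sleep(0.01)                # wait for the buffer to warm up
            continue
        sac_update(batch)
        updates_done += 1


sampler = threading.Thread(target=sampler_loop, args=(DummyEnv(),), daemon=True)
sampler.start()
learner_loop()
stop_event.set()
```

Since SAC is off-policy, sampling and learning do not have to happen in lockstep, which is why I am hoping the two phases can run concurrently rather than alternating.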

Can anyone offer some insight?

TL;DR: Is it possible with SAC to sample rollouts on the CPU while performing learning updates on the GPU, as opposed to doing it sequentially?