Parallelizing rollout sampling and learning for SAC

Hello community,
I am working on a problem using SAC. So far, I am trying to optimize throughput. I am running on a single node with 1 GPU allocated for learning, and 32 cores. I am running 24 rollout workers with several environments each.
I am wondering whether it is possible to parallelize rollout sampling and learning. Currently, I observe the CPUs being busy for a while with the GPU idle, then the CPUs idle down and the GPU gets busy, and only once the learning steps have finished does rollout sampling continue.
This is probably a natural consequence of encapsulating a single training iteration in the train() function, but I can't seem to find a straightforward way to overlap the two.
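
To illustrate what I mean outside of the train() abstraction, here is a rough, framework-agnostic sketch of the kind of overlap I am after: a background thread collects rollouts on the CPU into a shared replay buffer while the main thread runs SAC-style gradient updates. All names here (DummyEnv, sac_update, etc.) are placeholders I made up for the sketch, not actual library APIs:

```python
# Sketch: decoupled CPU rollout sampling and GPU learning for an
# off-policy algorithm like SAC. Placeholder names throughout.
import random
import threading
import time
from collections import deque

replay_buffer = deque(maxlen=1_000_000)
buffer_lock = threading.Lock()
stop_event = threading.Event()


class DummyEnv:
    """Stand-in for a real environment; returns fake transitions."""
    def reset(self):
        return [0.0, 0.0]

    def step(self, action):
        next_obs = [random.random(), random.random()]
        reward = random.random()
        done = random.random() < 0.01
        return next_obs, reward, done


def sampler_loop(env):
    """CPU-side rollout collection, running concurrently with learning."""
    obs = env.reset()
    while not stop_event.is_set():
        action = random.random()            # placeholder for policy inference
        next_obs, reward, done = env.step(action)
        with buffer_lock:
            replay_buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs


def sac_update(batch):
    """Placeholder for the actual SAC actor/critic update on the GPU."""
    time.sleep(0.001)


def learner_loop(batch_size=256, num_updates=1_000):
    """GPU-side learning that samples from the buffer as it grows."""
    updates_done = 0
    while updates_done < num_updates:
        with buffer_lock:
            if len(replay_buffer) < batch_size:
                batch = None
            else:
                batch = random.sample(list(replay_buffer), batch_size)
        if batch is None:
            time.sleep(0.01)                # wait for the buffer to warm up
            continue
        sac_update(batch)
        updates_done += 1


sampler = threading.Thread(target=sampler_loop, args=(DummyEnv(),), daemon=True)
sampler.start()
learner_loop()
stop_event.set()
```

Since SAC is off-policy, sampling and learning do not have to happen in lockstep, which is why I am hoping the two phases can run concurrently rather than alternating.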

Can anyone offer some insight?

TL;DR: Is it possible with SAC to sample rollouts on the CPU while performing learning updates on the GPU, as opposed to doing it sequentially?