What is the cleanest way to train an off-policy algorithm (e.g. SAC) on the sample batches collected by RolloutWorker.sample()?

I was wondering what the easiest way is to train an off-policy algorithm (e.g. SAC) on the sample batches collected by RolloutWorker.sample(). Something like:

```python
for n in range(num_iters):
    samples = worker.sample()
    # ... use the samples to train a SAC policy?
    # ... then get the critic network.
```

All RLlib algorithms already sample this way by default.
Have a look at SAC: it inherits from DQN, whose training_step() method does exactly what you describe. It collects sample batches from the rollout workers, adds them to a replay buffer, and updates the policy from buffer samples.
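So the simplest route is to just call algo.train(), which runs training_step() for you on each iteration. If you want to drive the loop manually anyway, here is a minimal sketch, assuming the old RLlib API stack (Algorithm, RolloutWorker, Policy.learn_on_batch) and Pendulum-v1 as a stand-in environment. Note that it feeds fresh samples straight into learn_on_batch() and therefore bypasses SAC's replay buffer, so it is not equivalent to the built-in training_step():

```python
import ray
from ray.rllib.algorithms.sac import SACConfig

ray.init()

# Build a SAC Algorithm; Pendulum-v1 is just a stand-in environment.
algo = SACConfig().environment("Pendulum-v1").build()

worker = algo.workers.local_worker()  # the local RolloutWorker
policy = algo.get_policy()            # the default SAC policy

num_iters = 10  # hypothetical iteration count
for _ in range(num_iters):
    samples = worker.sample()              # SampleBatch of env transitions
    info = policy.learn_on_batch(samples)  # one update on that batch
    print(info)

# The critic lives on the policy's model; for the default torch SAC
# model the Q-network should be reachable as (assumption) policy.model.q_net.
critic = policy.model
```

If you need the replay buffer and target-network updates to match SAC's actual update rule, it is easier to keep calling algo.train() and pull the critic out of algo.get_policy().model afterwards.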