Why isn't DDPPO training faster with more GPUs and CPUs?

I am using DDPPO and ran the experiment in three phases: first with 1 GPU and 10 CPUs per worker, then with double those resources, and finally with 4x the resources.
The environment is CartPole-v1, but training does not get any faster.

Here is the training figure: [figure]
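Roughly, the phase-1 setup looks like the sketch below. This is not my exact script, just an illustration: it assumes a Ray 2.x release that still ships `DDPPOConfig` (DDPPO was removed from recent RLlib versions), and `num_rollout_workers=2` is only an example value.

```python
from ray.rllib.algorithms.ddppo import DDPPOConfig

# Phase 1: 1 GPU and 10 CPUs per rollout worker.
# Phases 2 and 3 scale these per-worker resources by 2x and 4x.
config = (
    DDPPOConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=2)  # example value, not from my run
    .resources(num_gpus_per_worker=1, num_cpus_per_worker=10)
)

algo = config.build()
for _ in range(10):
    result = algo.train()
    print(result["episode_reward_mean"])
```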

Generally speaking, the training speedup with DDPPO will not necessarily be linear. Each worker computes gradients locally and then synchronizes them with all the other workers, so coordination overhead grows as you add workers.

But I also think CartPole is a bad example for showing the speedups you can get with DDPPO. The environment is so cheap to simulate that sampling is never the bottleneck; DDPPO pays off when rollout collection itself is expensive, for example with image observations or heavy simulators.
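To see why, you can measure how cheap CartPole is to simulate on a single core. A quick sketch, assuming Gymnasium's `CartPole-v1` (with the legacy `gym` package, `step()` returns a 4-tuple instead):

```python
import time
import gymnasium as gym  # assumes Gymnasium; legacy `gym` has a 4-tuple step()

env = gym.make("CartPole-v1")
env.reset(seed=0)

# Step the environment with random actions and time it.
n_steps = 100_000
start = time.perf_counter()
for _ in range(n_steps):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    if terminated or truncated:
        env.reset()
elapsed = time.perf_counter() - start

print(f"{n_steps / elapsed:,.0f} steps/sec on one core")
```

A single core can usually push on the order of 10^5 CartPole steps per second, so rollout collection is essentially free here. The per-iteration cost is dominated by gradient computation and worker synchronization, which extra CPUs and GPUs don't reduce for such a tiny model.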