RLlib performs worse when rollout_workers/env_runners are increased?

I’m using Ray 2.8.1 (I’ve also tried the latest 2.38.0), and RLlib’s performance seems to get worse when I add rollout workers (env_runners).
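For what it’s worth, my understanding is that on 2.38 the equivalent settings go through the new env_runners API instead of rollouts (this is just how I read the renamed config, not something I verified in my runs):

from ray.rllib.algorithms.dqn import DQNConfig

config = (
    DQNConfig()
    .environment("CartPole-v1")
    # new-API names for num_rollout_workers / num_envs_per_worker
    .env_runners(num_env_runners=0, num_envs_per_env_runner=8)
)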

The results summarized below were obtained with 2.8.1 using the following script:

from ray import tune, train
from ray.rllib.algorithms.dqn import DQNConfig

config: DQNConfig = (
    DQNConfig()
    .environment("CartPole-v1")
    # 0 remote rollout workers: sampling runs on the local worker,
    # with 8 vectorized envs on that worker
    .rollouts(num_rollout_workers=0, num_envs_per_worker=8)
    .resources(num_gpus=1)
)

tuner = tune.Tuner(
    "DQN",
    param_space=config.to_dict(),
    run_config=train.RunConfig(
        name="CartPole_Env_Parallel",
        checkpoint_config=train.CheckpointConfig(checkpoint_at_end=True),
        stop={
            "episode_reward_mean": 300
        }
    )
)

results = tuner.fit()
print(results.get_best_result())
  1. The black line was trained with num_rollout_workers=0, num_envs_per_worker=8; it reached a mean reward of 300 in about 3 minutes.
  2. The blue line was trained with num_rollout_workers=8, num_envs_per_worker=1; it only reached the same mean reward after about 9 minutes (the one-line config diff is sketched below).
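
For completeness, the blue run used the same script as above with only the rollouts call changed, roughly like this (sketch based on the description above, not copied from my actual run script):

config: DQNConfig = (
    DQNConfig()
    .environment("CartPole-v1")
    # 8 remote rollout workers, each with a single (non-vectorized) env
    .rollouts(num_rollout_workers=8, num_envs_per_worker=1)
    .resources(num_gpus=1)
)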

Is this expected?