All Ray resources mapped to only two physical processors

Hi,

I am using Ray 2.8.1 with a single-agent RL environment, a torch ModelV2, and the PPO algorithm. My problem is the following:
I can specify the resources (CPUs in this case) that Ray is allowed to use in ray.init().
I can specify resources in the PPOConfig, and if I understand correctly, those are the resources per trial, e.g. num_cpus_per_worker and num_cpus_for_local_worker.
If I run a tune.Tuner(…).fit(), the resource specifications are respected. For example, with ray.init(num_cpus=12), num_cpus_for_local_worker=4, and num_rollout_workers=0, it runs 3 trials in parallel, as expected. As another example, with ray.init(num_cpus=12), num_cpus_for_local_worker=1, num_rollout_workers=1, and num_cpus_per_worker=1, it runs 6 trials in parallel.
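To make the arithmetic concrete, here is a minimal sketch of the two setups (same config API names as in the full script below; trials in parallel = total CPUs / CPUs per trial):

    import ray
    from ray.rllib.algorithms.ppo import PPOConfig

    ray.init(num_cpus=12)

    # Setup 1: 4 CPUs for the local worker, no rollout workers
    # -> 4 CPUs per trial -> 12 / 4 = 3 trials in parallel.
    config_a = (
        PPOConfig()
        .rollouts(num_rollout_workers=0)
        .resources(num_cpus_for_local_worker=4)
    )

    # Setup 2: 1 CPU for the local worker + 1 rollout worker with 1 CPU
    # -> 2 CPUs per trial -> 12 / 2 = 6 trials in parallel.
    config_b = (
        PPOConfig()
        .rollouts(num_rollout_workers=1)
        .resources(num_cpus_for_local_worker=1, num_cpus_per_worker=1)
    )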
When I inspect the CPU usage with htop, though, all trials are executed on the same two physical cores, splitting the CPU% between each other (see picture).

How can I configure Ray Tune to distribute the load across all available physical cores? Or is this something I have to take up with the cluster admins?
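One thing I'm unsure about is whether this is a thread-count issue rather than a scheduling one. Here is a minimal sketch of what I could try, assuming per-worker env vars via runtime_env are the right mechanism (I believe Ray constrains OMP_NUM_THREADS for its workers unless it is set explicitly, so this is a guess, not a confirmed fix):

    import ray

    # Hypothetical tweak: allow each worker more OpenMP/torch threads,
    # in case torch is what keeps the load on two cores.
    ray.init(
        num_cpus=12,
        runtime_env={"env_vars": {"OMP_NUM_THREADS": "4"}},
    )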

This is all the configuration I'm doing; the remaining config files relate to the application itself. Let me know if I should provide more info.

    import ray
    from ray import air, tune
    from ray.rllib.algorithms.ppo import PPOConfig

    # CommunicationV1_env, GNN_PyG, env_config, run_name and the tune_config
    # dict used below all come from the application's own config files.

    ray.init(num_cpus=12)
    
    tune.register_env("CommunicationV1_env", lambda env_config: CommunicationV1_env(env_config))
    tunable_model_config = ...
    model = {"custom_model": GNN_PyG,
            "custom_model_config": tunable_model_config}

    # ppo config
    ppo_config = (
        PPOConfig()
        .environment(
            "CommunicationV1_env", # @todo: need to build wrapper
            env_config=env_config)
        .training(
            model=model,
            _enable_learner_api=False,  # stay on the old ModelV2 stack (no Learner API)
        )
        .rollouts(num_rollout_workers=0)
        .resources(
            num_cpus_per_worker=2,
            num_cpus_for_local_worker=4,
            placement_strategy="PACK",
        )
        .rl_module(_enable_rl_module_api=False)  # opt out of the new RLModule API
    )

    # run and checkpoint config
    run_config = air.RunConfig(
        name=run_name,
        stop={"timesteps_total": tune_config["max_timesteps"]}
    )

    # tune config (this reassigns the name of the application's tune_config dict read above)
    tune_config = tune.TuneConfig(
        num_samples=tune_config["num_samples"]
    )

    tuner = tune.Tuner(
        "PPO",
        run_config=run_config,
        tune_config=tune_config,
        param_space=ppo_config.to_dict()
    )

    tuner.fit()
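
In case it helps with diagnosis: a minimal sketch I can run to print the CPU affinity mask from inside Ray workers (assuming Linux, since os.sched_getaffinity is Linux-only). If something is pinning the trials, each worker should report only the same two core ids:

    import os
    import ray

    @ray.remote(num_cpus=1)
    def report_affinity():
        # On Linux, this returns the set of core ids this process may run on.
        return sorted(os.sched_getaffinity(0))

    ray.init(num_cpus=12)
    # With nothing pinning the workers, each set should contain all 12 cores.
    print(ray.get([report_affinity.remote() for _ in range(6)]))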