Hi,
I am using Ray 2.8.1 with a single-agent RL environment, a torch ModelV2 model, and the PPO algorithm. My problem is the following:
I can specify the resources (CPUs in this case) that Ray is allowed to use in ray.init().
I can specify resources in the PPOConfig, and if I understand correctly, these apply per trial, e.g. num_cpus_per_worker and num_cpus_for_local_worker.
If I run a tune.Tuner(…).fit(), the resource specifications are respected. For example, with ray.init(num_cpus=12), num_cpus_for_local_worker=4 and num_rollout_workers=0, it runs, as expected, 3 trials in parallel. Another example: with ray.init(num_cpus=12), num_cpus_for_local_worker=1, num_rollout_workers=1 and num_cpus_per_worker=1, it runs 6 trials in parallel.
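To make the arithmetic behind those two examples explicit, this is how I understand Tune derives the trial parallelism from the CPU budget (illustrative sketch only, the function name is mine):

```python
def max_parallel_trials(total_cpus: int,
                        cpus_for_local_worker: int,
                        num_rollout_workers: int,
                        cpus_per_worker: int) -> int:
    """Each trial reserves CPUs for its local worker plus its rollout workers."""
    cpus_per_trial = cpus_for_local_worker + num_rollout_workers * cpus_per_worker
    return total_cpus // cpus_per_trial

# the two configurations from above:
print(max_parallel_trials(12, 4, 0, 1))  # -> 3
print(max_parallel_trials(12, 1, 1, 1))  # -> 6
```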
When I inspect the CPU usage with htop, though, all trials are executed on the same two physical cores, splitting the CPU% between each other (see picture).
How can I configure Ray Tune to distribute the load across all available physical cores? Or is this something I have to handle with the cluster admins?
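For reference, this is how I check which cores the driver process is actually allowed to run on (Linux only; `os.sched_getaffinity` is a standard-library call, and a small affinity mask here would point to the job scheduler or taskset/cgroups pinning the process, not to Ray):

```python
import os

# The set of core IDs this process may be scheduled on. If the cluster's job
# script or cgroup config restricts this to two cores, every Ray worker forked
# from this process inherits the same restriction.
allowed = os.sched_getaffinity(0)
print(f"process may run on {len(allowed)} cores: {sorted(allowed)}")
```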
This is all the configuration I'm doing; all the config files relate to the application itself. Let me know if I should provide more info.
import ray
from ray import air, tune
from ray.rllib.algorithms.ppo import PPOConfig

ray.init(num_cpus=12)

tune.register_env("CommunicationV1_env", lambda env_config: CommunicationV1_env(env_config))

tunable_model_config = ...
model = {
    "custom_model": GNN_PyG,
    "custom_model_config": tunable_model_config,
}

# ppo config
ppo_config = (
    PPOConfig()
    .environment(
        "CommunicationV1_env",  # @todo: need to build wrapper
        env_config=env_config,
    )
    .training(
        model=model,
        _enable_learner_api=False,
    )
    .rollouts(num_rollout_workers=0)
    .resources(
        num_cpus_per_worker=2,
        num_cpus_for_local_worker=4,
        placement_strategy="PACK",
    )
    .rl_module(_enable_rl_module_api=False)
)

# run and checkpoint config
run_config = air.RunConfig(
    name=run_name,
    stop={"timesteps_total": tune_config["max_timesteps"]},
)

# tune config (note: this rebinds the name `tune_config` from the dict used above)
tune_config = tune.TuneConfig(
    num_samples=tune_config["num_samples"],
)

tuner = tune.Tuner(
    "PPO",
    run_config=run_config,
    tune_config=tune_config,
    param_space=ppo_config.to_dict(),
)
tuner.fit()