I still don’t understand how RLlib determines its resource requirements.
For Tune, I worked it out to the following formula:
num_samples * ((num_workers * num_cpus_per_worker) + (num_workers * num_gpus_per_worker))
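As a quick sanity check, here is the formula above as plain Python, split into its CPU and GPU parts (the helper name and the per-resource split are mine, not a Ray API):

```python
def tune_trial_resources(num_samples, num_workers,
                         num_cpus_per_worker, num_gpus_per_worker):
    """Total CPUs and GPUs Tune would need for all concurrent samples,
    per the formula above (hypothetical helper, not part of Ray)."""
    total_cpus = num_samples * num_workers * num_cpus_per_worker
    total_gpus = num_samples * num_workers * num_gpus_per_worker
    return total_cpus, total_gpus

# e.g. 4 samples of the 2-worker config below:
totals = tune_trial_resources(4, 2, 6, 0.5)  # (48, 4.0)
```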
Now here is the same scenario for RLlib.
Feasible:
(It runs with multiple raylet OOM warning messages, but it terminates with the final result message.)
import ray
from ray.rllib.algorithms import ppo

if __name__ == "__main__":
    ray.init(num_cpus=12, num_gpus=1)
    config = (
        ppo.PPOConfig()
        .environment("CartPole-v1")
        .rollouts(num_rollout_workers=2)
        .resources(num_cpus_per_worker=6, num_gpus_per_worker=0.5)
        .framework("tf2", eager_tracing=True)
    )
    algo = config.build()
    algo.train()
    print("One iteration done")
Infeasible:
(It hangs in an apparently endless loop.)
import ray
from ray.rllib.algorithms import ppo

if __name__ == "__main__":
    ray.init(num_cpus=12, num_gpus=1)
    config = (
        ppo.PPOConfig()
        .environment("CartPole-v1")
        .rollouts(num_rollout_workers=3)
        .resources(num_cpus_per_worker=4, num_gpus_per_worker=0.3)
        .framework("tf2", eager_tracing=True)
    )
    algo = config.build()
    algo.train()
    print("One iteration done")
What is the difference between the (3*4, 3*0.3) = (12, 0.9) and the (2*6, 2*0.5) = (12, 1) configurations?
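To make the comparison concrete, here are the worker-only totals for both configurations as plain Python (my own arithmetic to make the numbers explicit; this is not anything RLlib computes for you):

```python
# Feasible config: 2 rollout workers x (6 CPUs, 0.5 GPU) each
feasible_cpus = 2 * 6      # 12 CPUs
feasible_gpus = 2 * 0.5    # 1.0 GPU

# Infeasible config: 3 rollout workers x (4 CPUs, 0.3 GPU) each
infeasible_cpus = 3 * 4    # 12 CPUs
infeasible_gpus = 3 * 0.3  # ~0.9 GPU (float rounding: 0.8999...)
```

So both ask for the same 12 worker CPUs, and the infeasible one even asks for *less* GPU, which is what confuses me.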