Run PPO on multiple nodes

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hey! I’m trying to train PPO on a computationally expensive environment that needs to run on a GPU.
I’m running a Ray cluster across multiple (e.g., 8) nodes with 8 GPUs each, launched via Slurm.

How can I get PPO to use the resources available efficiently?

On a single node with 8 GPUs it’s straightforward to split resources across the driver and workers (sketch below) with:

  • num_workers (e.g., 35)
  • num_gpus (e.g., 1)
  • num_gpus_per_worker (e.g., 0.2)

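For context, here is roughly what the single-node run looks like (a minimal sketch using the config-dict API; `MyExpensiveEnv` is a placeholder for my registered custom environment):

```python
import ray
from ray import tune

ray.init()  # single node: start a local Ray instance

tune.run(
    "PPO",
    config={
        "env": "MyExpensiveEnv",      # placeholder for the registered custom env
        "framework": "torch",
        "num_workers": 35,            # rollout workers
        "num_gpus": 1,                # GPU for the driver / learner
        "num_gpus_per_worker": 0.2,   # 35 * 0.2 = 7 GPUs shared by the workers
    },
    stop={"training_iteration": 100},
)
```
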
But when I try to scale this to 8 nodes with 8 GPUs each using:

  • num_workers = 8 * 35 = 280
  • num_gpus = 8 * 1 = 8
  • num_gpus_per_worker = 0.2

the workers no longer use any GPUs.
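
Concretely, the only changes in this multi-node attempt are connecting to the existing Slurm-launched cluster and scaling the numbers (again a sketch with the same placeholder env):

```python
import ray
from ray import tune

# Connect to the Ray cluster already started on the Slurm allocation
# (head + worker nodes launched with `ray start`).
ray.init(address="auto")

tune.run(
    "PPO",
    config={
        "env": "MyExpensiveEnv",      # same placeholder env as above
        "framework": "torch",
        "num_workers": 8 * 35,        # 280 rollout workers across 8 nodes
        "num_gpus": 8,                # GPUs for the driver / learner
        "num_gpus_per_worker": 0.2,   # 280 * 0.2 = 56 GPUs for the workers
    },
    stop={"training_iteration": 100},
)
```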

Can we run PPO on multiple nodes? What is the right way to set this up? Is DDPPO the only option, or can vanilla PPO work across multiple nodes?

Hi @Theophile_Gervet ,

Yes, vanilla PPO can run on multiple nodes; it has no notion of nodes. Tune schedules it onto a set of resources, which it then fills with actors that consume those resources. How do you create your cluster, though? If you connect your nodes to your cluster head, you have to make their resources available to the cluster. Have you looked at the example that deals with this?
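
As a quick sanity check (just a sketch, assuming the cluster was started with `ray start` on every node), you can verify that all GPUs were actually registered with the cluster before launching the trial:

```python
import ray

# Connect to the running cluster (started via `ray start --head` on the head
# node and `ray start --address=<head_ip>:6379` on each worker node; pass
# --num-gpus explicitly if the GPUs are not auto-detected).
ray.init(address="auto")

# Should report the full 64 GPUs and all CPUs across the 8 nodes.
print(ray.cluster_resources())
print(ray.available_resources())
```

If `ray.cluster_resources()` reports fewer than 64 GPUs, the worker nodes did not join with their GPUs visible, and RLlib cannot place GPU-requesting workers on them.
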
Sorry for the trouble and the late answer! I hope this fixes your issue.