We are trying to run a machine learning training task on a Ray cluster. Our head pod has no GPU, while the worker pods have 8 GPUs.
How can we prevent the head pod from running any computation?
Thanks for your reply.
Are there other ways to mark the head as NoSchedule? For example, the Kubernetes master node is NoSchedule by default.
The method suggested above was to decorate remote tasks with a GPU requirement:
@ray.remote(num_gpus=1)
Another method is to declare the head node as having 0 CPU. Ray tasks implicitly assume 1 CPU so no Ray tasks will run on the head node [unless you explicitly declare the task as requiring num_cpus=0].
To declare that the head pod has 0 CPUs, add --num-cpus=0 to the head's Ray start command. Alternatively, if you are using the Ray Kubernetes Operator, you can add {"CPU": 0} to the head podType's rayResources field.
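To make the reasoning concrete, here is a toy model of why a head that advertises 0 CPUs ends up with no tasks. It is only a sketch of the resource-matching idea, not Ray's actual scheduler; the cluster dictionary, the node names, the worker's CPU count, and the helper functions fits and eligible_nodes are all made up for illustration.

```python
# Toy model of resource-based placement. This is NOT Ray's scheduler,
# just an illustration of the matching rule described above.

# Ray tasks implicitly request 1 CPU unless declared otherwise.
DEFAULT_TASK_DEMAND = {"CPU": 1}


def fits(node_resources, task_demand):
    """Return True if the node advertises enough of every requested resource."""
    return all(
        node_resources.get(name, 0) >= amount
        for name, amount in task_demand.items()
    )


def eligible_nodes(cluster, task_demand=None):
    """List the names of nodes that could host a task with the given demand."""
    demand = DEFAULT_TASK_DEMAND if task_demand is None else task_demand
    return [name for name, resources in cluster.items() if fits(resources, demand)]


# Hypothetical cluster: the head advertises 0 CPUs, the worker has 8 GPUs.
cluster = {
    "head": {"CPU": 0, "GPU": 0},
    "worker": {"CPU": 16, "GPU": 8},
}

# A default task (1 CPU) cannot land on the head.
print(eligible_nodes(cluster))
# A task that also asks for a GPU, as with @ray.remote(num_gpus=1).
print(eligible_nodes(cluster, {"CPU": 1, "GPU": 1}))
# Only a task that explicitly asks for 0 CPUs could still land on the head.
print(eligible_nodes(cluster, {"CPU": 0}))
```

In this model the first two calls list only the worker, and the last call lists both nodes, which matches the num_cpus=0 exception mentioned above.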
See also the discussion here:
Cool, thank you for your detailed response.