I am getting the following warning:
"(autoscaler +18m43s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster."
My Ray cluster has the following settings:
setup_ray_cluster(
    num_worker_nodes=6,
    num_cpus_worker_node=4,
    num_gpus_worker_node=1,
    num_cpus_head_node=4,
    num_gpus_head_node=1)
and the Ray trainer is configured with ScalingConfig(num_workers=6, trainer_resources={"CPU": 4}, use_gpu=True, resources_per_worker={"CPU": 4, "GPU": 1}).
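For reference, here is a rough tally of the numbers above. It is plain Python arithmetic with no Ray calls, and it assumes the head node's CPUs and GPUs are schedulable just like the worker nodes' (I have not confirmed that for this setup). The names `cluster`, `requested`, and `leftover` are only for illustration.

```python
# Plain-Python tally of the settings above; no Ray APIs are called.
# Assumption: head-node resources count toward the schedulable total.

cluster = {
    "CPU": 6 * 4 + 4,  # 6 worker nodes x 4 CPUs, plus 4 head-node CPUs
    "GPU": 6 * 1 + 1,  # 6 worker nodes x 1 GPU, plus 1 head-node GPU
}

# Values copied from the ScalingConfig in the post.
num_workers = 6
trainer_resources = {"CPU": 4}
resources_per_worker = {"CPU": 4, "GPU": 1}

# Trainer reservation plus one reservation per training worker.
requested = {
    name: trainer_resources.get(name, 0)
    + num_workers * resources_per_worker.get(name, 0)
    for name in cluster
}

# Whatever remains for any other task or actor in the cluster.
leftover = {name: cluster[name] - requested[name] for name in cluster}

print("cluster:  ", cluster)
print("requested:", requested)
print("leftover: ", leftover)
```

Under that assumption the trainer and workers together ask for every CPU in the cluster, so nothing would be left for an additional {'CPU': 1.0} request. That seems consistent with the warning, but I am not sure it is the actual cause.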
How can I solve this?
Also, where does the Ray Train coordinator run? How is the head node different from the node where the trainer coordinator runs?