I’m not sure if this is a bug.
I have 4 machines with two GPUs each, and 5 tasks that require [2,2,2,1,1] GPUs respectively. I have set num_gpus for each of them.
What I expect is that three machines each run one two-GPU task, and the fourth machine runs both one-GPU tasks.
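To make the expected layout concrete, here is a small sketch of the packing I have in mind. It is only my mental model (a first-fit-decreasing bin packing), not a description of how Ray's scheduler actually works; the function name and parameters are made up for illustration.

```python
def first_fit_decreasing(requests, num_nodes, gpus_per_node):
    """Place GPU requests on nodes, largest first, using the first node that fits."""
    free = [gpus_per_node] * num_nodes          # free GPUs left on each node
    placement = [[] for _ in range(num_nodes)]  # requests assigned to each node
    for req in sorted(requests, reverse=True):
        for node, avail in enumerate(free):
            if avail >= req:
                free[node] -= req
                placement[node].append(req)
                break
        else:
            # No single node has enough free GPUs for this request.
            raise ValueError(f"no node can fit a {req}-GPU task")
    return placement


expected = first_fit_decreasing([2, 2, 2, 1, 1], num_nodes=4, gpus_per_node=2)
print(expected)  # → [[2], [2], [2], [1, 1]]
```

With this packing, no task has to span two machines.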
In practice, however, the allocation seems to be random. Sometimes it matches the layout I expected. Other times the two one-GPU tasks land on two different machines, and one of the two-GPU tasks ends up split across two machines.
During initialization, I can see that the two one-GPU tasks are placed on two different machines.
Because of the restriction quoted below, I can’t include the screenshots here. I’ll try to add them in the comments.
Sorry, new users can only put one embedded media item in a post.
When I call a two-GPU inference task, I can see that it is running on two machines.
Inference is much slower because the two machines have to communicate with each other.
So, can Ray assign actors to specific machines? Or is there another way to achieve this? I’m looking forward to your response.
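For reference, one mechanism that looks relevant is Ray's placement groups with the `STRICT_PACK` strategy, which asks for all bundles of a group to be placed on a single node. Below is a minimal sketch of how I imagine using it for my layout. The `Worker` class, `bundles_for`, `GROUPS`, and `launch` are hypothetical names, and the assumption that each actor needs one CPU is mine; I have not confirmed that this fixes the behaviour described above.

```python
def bundles_for(gpu_counts):
    """One resource bundle per actor; assumes each actor also needs 1 CPU."""
    return [{"CPU": 1, "GPU": n} for n in gpu_counts]


# Hypothetical grouping: each inner list should end up on one machine.
GROUPS = [[2], [2], [2], [1, 1]]


def launch(groups=GROUPS):
    # Ray is imported lazily so the helpers above can be checked without a cluster.
    import ray
    from ray.util.placement_group import placement_group
    from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

    @ray.remote
    class Worker:  # placeholder actor, stands in for the real inference actor
        def node(self):
            return ray.get_runtime_context().get_node_id()

    ray.init(address="auto")  # assumes an already running cluster
    actors = []
    for gpu_counts in groups:
        # STRICT_PACK: every bundle of this group must fit on the same node.
        pg = placement_group(bundles_for(gpu_counts), strategy="STRICT_PACK")
        ray.get(pg.ready())  # block until the resources are reserved
        for index, n in enumerate(gpu_counts):
            strategy = PlacementGroupSchedulingStrategy(
                placement_group=pg,
                placement_group_bundle_index=index,
            )
            actors.append(
                Worker.options(
                    num_cpus=1, num_gpus=n, scheduling_strategy=strategy
                ).remote()
            )
    return actors
```

Calling `launch()` on the cluster and then `ray.get([a.node.remote() for a in actors])` should show which node each actor landed on, so the last two actors can be checked for sharing a node ID.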