Actor placement and execution resources

How severely does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

Hi there.

I just read the document about the Ray architecture (link), in particular a paragraph in Resource Management and Scheduling, which says an actor will request 0 CPUs for execution by default but 1 CPU for placement. As far as I understand, when creating an actor Ray reserves 1 CPU for the actor, and when executing a method of the actor Ray has to use 1 CPU for execution. I am a little confused. What does it mean that “an actor will request 0 CPUs for execution”?

@sangcho, could you shed some light here?

@yic, could you leave a comment here?

Hi, maybe the comment there is wrong. Basically, the current mechanism is as follows:

  1. When an actor is scheduled, it requires 1 CPU.
  2. Once the actor is created, it requires 0 CPU.

So this ensures only up to num_cpus actors can be created at the same time, but you can eventually have an unlimited # of actors if you don’t specify num_cpus. By the scheduling policy, actors are spread across the cluster by default.
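The two-step mechanism above can be sketched with a toy model (this is just an illustration of the described behavior, not Ray’s actual scheduler code): placement temporarily reserves 1 CPU, and once creation finishes the reservation is released, so the number of live actors is not capped by num_cpus.

```python
# Toy model of the behavior described above (NOT Ray's real scheduler):
# creating an actor reserves 1 CPU for PLACEMENT, but once the actor is
# alive it requires 0 CPUs, so the reservation is released and the next
# actor can be placed on the same node.

class ToyNode:
    def __init__(self, num_cpus):
        self.num_cpus = num_cpus      # total logical CPUs on the node
        self.available = num_cpus     # CPUs currently free for placement
        self.actors = []              # actors already created here

    def try_place_actor(self, placement_cpus=1):
        """Placement phase: the pending actor reserves placement_cpus."""
        if self.available < placement_cpus:
            return False              # no room to schedule right now
        self.available -= placement_cpus
        return True

    def finish_creation(self, name, placement_cpus=1):
        """Creation done: execution needs 0 CPUs, so release the reservation."""
        self.actors.append(name)
        self.available += placement_cpus

node = ToyNode(num_cpus=2)
for i in range(5):
    # At most 2 actors could be *pending placement* at once on this node,
    # but each one releases its CPU as soon as it is created.
    assert node.try_place_actor()
    node.finish_creation(f"actor_{i}")

print(len(node.actors))   # 5 actors live on a 2-CPU node
print(node.available)     # 2 -- all CPUs free again
```

So num_cpus bounds how many actors can be in the middle of being scheduled at once, not how many can exist.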

Hi @sangcho, thank you for your response. I am a little confused about your second point: once an actor is created, it requires 0 CPU. But neither a remote task nor another actor can be scheduled to the worker that already holds the created actor, so I think it still requires 1 CPU. Isn’t that so?

But neither a remote task nor another actor can be scheduled to the worker that already holds the created actor

It is possible. I think the simple explanation is:

  1. You can schedule only up to num_cpus=N actors at the same time.
  2. Once those actors are created, you can create an unlimited # of additional actors on that node.
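One nuance worth adding, sketched again as a toy model (an illustration of the described defaults, not Ray’s real scheduler): the “unlimited actors” behavior only holds for the default where an actor keeps 0 CPUs after creation. If you pass an explicit num_cpus to the actor, that reservation is held for the actor’s whole lifetime, so the node really can fill up.

```python
# Toy contrast (NOT Ray's real scheduler): default actors hold 0 CPUs after
# creation, while actors given an explicit num_cpus keep that reservation
# for their whole lifetime.

class ToyNode:
    def __init__(self, num_cpus):
        self.available = num_cpus     # CPUs free for placement

    def create_actor(self, lifetime_cpus):
        """lifetime_cpus=0 models the default: placement briefly needs 1 CPU,
        but nothing is held after the actor is created."""
        if self.available < max(lifetime_cpus, 1):
            return False                      # cannot even place the actor
        self.available -= lifetime_cpus       # only the lifetime share sticks
        return True

node = ToyNode(num_cpus=2)

# Default actors (no explicit num_cpus): effectively unbounded.
assert all(node.create_actor(lifetime_cpus=0) for _ in range(10))

# Actors with an explicit num_cpus=1 keep their CPU: only 2 fit.
assert node.create_actor(lifetime_cpus=1)
assert node.create_actor(lifetime_cpus=1)
assert not node.create_actor(lifetime_cpus=1)   # node is full now
```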

I mean that if a worker still holds an actor, we can schedule neither another actor nor a remote function to that worker. Is that right?

when you say “worker”, did you mean worker process?

You cannot schedule a new actor onto a worker process that has already initiated an actor. But Ray can have more than num_cpus worker processes (it is not a hard cap) per node.

when you say “worker”, did you mean worker process?

This is exactly what I mean.

You cannot schedule a new actor onto a worker process that has already initiated an actor. But Ray can have more than num_cpus worker processes (it is not a hard cap) per node.

Thanks, this makes sense to me.