Relationship between Ray Workers, trials, and CPUs

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.

2. Environment:

  • Ray version: 2.43.0
  • Python version: 3.11.0
  • OS: Ubuntu 22.04.4 LTS
  • Cloud/Infrastructure: Azure
  • Other libs/tools (if relevant): Ray on Databricks Spark

3. Question:
Assumptions (correct me if I am wrong about any of these):

  • Each trial will by default be allocated 1 CPU.
  • Each trial corresponds to a Ray task.
  • Each Ray task is scheduled on a Ray Worker that has sufficient resources for it.
  • Each Ray Worker only runs one Ray task at a time.

Given the above, assume I have:

  • 8 Ray Workers (4 CPUs each)
  • 50 trials to be scheduled with default resource usage and no max_concurrency

With all of that, am I correct that the following will occur?

  • 8 trials will be started on the 8 Ray Workers at first
  • These running trials would be using 8 CPUs (1 per trial, as is default), with the other 24 CPUs being used for other Ray processes or idling
  • When one of the trials finishes, another will be scheduled so that at all times, as long as there are more trials to schedule, there will be 8 concurrent trials?

The reason I am confused is that I ran an experiment with this configuration and found 22 trials in the RUNNING state at one point, and I do not understand how 22 of them could be running at the same time.
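For concreteness, here is a minimal sketch of the kind of Tune run I am describing (the `objective` function and search space are just placeholders):

```python
from ray import tune

def objective(config):
    # Placeholder trainable; each trial requests 1 CPU by default.
    return {"score": config["x"] ** 2}

tuner = tune.Tuner(
    objective,
    param_space={"x": tune.uniform(0, 1)},
    tune_config=tune.TuneConfig(num_samples=50),  # 50 trials, no concurrency limit set
)
results = tuner.fit()
```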

Hey @ishaan-mehta, what do you mean by “Ray Worker” in this context? Are you referring to worker nodes?

Hi @matthewdeng, thanks for the response! Correct, I mean worker nodes. I'm just used to saying “Ray Worker” to differentiate between Spark worker nodes and Ray worker nodes (as I am running Ray on a Spark cluster), apologies for the confusion.

Got it! In that case this assumption is incorrect:

  • Each Ray Worker only runs one Ray task at a time.

Each worker node can schedule as many concurrent tasks as it has logical CPU resources available. In your case, with 8 worker nodes of 4 CPUs each (32 CPUs total), there can be up to 32 trials running at once.
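To make this concrete (the numbers here are hypothetical), you can check the CPU count Tune schedules against and make the per-trial request explicit:

```python
import ray
from ray import tune

ray.init(address="auto")  # connect to the existing Ray cluster
print(ray.cluster_resources().get("CPU"))  # e.g. 32.0 for 8 nodes x 4 CPUs

def objective(config):
    return {"score": config["x"] ** 2}

# Trials request 1 CPU each by default; making it explicit (or raising it)
# directly controls how many trials can run at the same time.
trainable = tune.with_resources(objective, {"cpu": 1})
```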

Ohhh, OK – thank you for the clarification! In that case, what relationships and limitations exist w.r.t. Ray worker nodes and Ray Tune trials? Is there any reason to increase or decrease the number of worker nodes (keeping the Spark cluster size constant) when only using Ray Tune? Apologies if this is a silly question, just trying to wrap my head around things.

When you say keeping the Spark cluster size constant, what are you defining as cluster size?

For example, if your baseline is 4x8CPU nodes:

  • If you change this to 1x32CPU node (different node type, same number of CPUs), the time to run the Tune experiment should be approximately the same.
  • If you change this to 1x8CPU node (same node type, different number of nodes), the Tune experiment should take approximately 4x as long, since you have 1/4 the total CPUs.
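A back-of-the-envelope way to see this (assuming every trial requests the default 1 CPU and trials take roughly equal time):

```python
# Pure Python, no Ray needed: concurrency is bounded by total CPUs / CPUs per trial.
def max_concurrent_trials(num_nodes: int, cpus_per_node: int, cpus_per_trial: int = 1) -> int:
    return (num_nodes * cpus_per_node) // cpus_per_trial

print(max_concurrent_trials(4, 8))   # 32 -> baseline
print(max_concurrent_trials(1, 32))  # 32 -> same total CPUs, similar wall-clock time
print(max_concurrent_trials(1, 8))   # 8  -> 1/4 the CPUs, roughly 4x the wall-clock time
```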

Hey @matthewdeng — good question. When I say cluster size is constant, I mean the same number of Spark nodes and the same number of CPUs per Spark node.

So if I have 4x8CPU Spark worker nodes, is there any reason to have 2 Ray worker nodes per Spark worker node (so that each of the 8 total Ray worker nodes has 4 CPUs) as opposed to having a 1:1 relationship between Spark and Ray worker nodes (so that each of the 4 total Ray worker nodes has 8 CPUs)?

I guess I am just wondering what the pros and cons are of manipulating the ratio of Spark workers to Ray workers if the efficiency is purely based on CPU availability rather than Ray worker count.

I would recommend keeping them 1:1 and only adjusting if you run into issues that would require changing the ratio (though nothing stands out as obvious to me at the moment).
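If it helps, here is roughly how the two layouts would be set up with `ray.util.spark.setup_ray_cluster`. Treat this as a sketch: the exact parameter names have changed across Ray releases, so check the signature for your installed version first.

```python
from ray.util.spark import setup_ray_cluster, shutdown_ray_cluster

# 1:1 layout (recommended): 4 Ray worker nodes, one per 8-CPU Spark worker.
# NOTE: parameter names vary by Ray version (older releases use
# num_worker_nodes / num_cpus_per_node); check help(setup_ray_cluster).
setup_ray_cluster(max_worker_nodes=4, num_cpus_worker_node=8)

# ... run the Tune experiment ...

shutdown_ray_cluster()

# 2:1 layout for comparison: 8 Ray worker nodes with 4 CPUs each on the same
# 4x8CPU Spark cluster. Tune sees 32 CPUs total either way.
# setup_ray_cluster(max_worker_nodes=8, num_cpus_worker_node=4)
```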
