How severely does this issue affect your experience of using Ray?
- Low: It annoys or frustrates me for a moment.
Hello, new user of Ray Serve here. I am deploying my Serve application using KubeRay and am confused as to why there is duplication between `serveConfigV2` and `rayClusterConfig`.
Concretely, in `serveConfigV2` I set the number of replicas and the CPU/memory requirements for each deployment. Shouldn’t the `rayClusterConfig` head and worker group specs be entirely determined by this? Why make users responsible for keeping these in sync / optimizing the bin packing?
Hi Steve! Welcome to the Ray community~
So, the duplication between `serveConfigV2` and `rayClusterConfig` arises because they serve different purposes.
`serveConfigV2` is used to configure the specifics of your Ray Serve deployments, such as the number of replicas and resource requirements for each deployment.
`rayClusterConfig` is used to define the overall configuration of the Ray cluster, including the head and worker node specifications.
The reason for this separation is that Ray Serve operates on top of the Ray cluster, and while it can request resources based on deployment needs, the cluster itself must be configured to provide those resources.
This means that users need to ensure that the cluster has enough resources to meet the demands specified in `serveConfigV2`. This setup allows for flexibility in resource allocation and scaling, but requires users to manage and optimize resource allocation between the deployment and the cluster configuration.
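To make the "enough resources" check concrete, here is a rough Python sketch. It is not a Ray or KubeRay API: the two lists are simplified, hypothetical stand-ins for what you would read out of `serveConfigV2` (replicas and CPUs per deployment) and `rayClusterConfig` (worker group size and CPUs per pod), and it ignores the head node and any overhead from Ray's own processes.

```python
# Simplified, hypothetical stand-in for the deployments in serveConfigV2.
deployments = [
    {"name": "preprocess", "num_replicas": 2, "num_cpus": 1},
    {"name": "model", "num_replicas": 3, "num_cpus": 2},
]

# Simplified, hypothetical stand-in for the worker groups in rayClusterConfig.
worker_groups = [
    {"group_name": "cpu-workers", "replicas": 2, "cpus_per_pod": 4},
]


def total_requested_cpus(deployments):
    """CPUs requested by all Serve replicas combined."""
    return sum(d["num_replicas"] * d["num_cpus"] for d in deployments)


def total_cluster_cpus(worker_groups):
    """CPUs offered by all worker pods combined."""
    return sum(g["replicas"] * g["cpus_per_pod"] for g in worker_groups)


requested = total_requested_cpus(deployments)
available = total_cluster_cpus(worker_groups)
print(f"requested={requested} available={available} fits={requested <= available}")
```

Keep in mind that matching totals is only a necessary condition. Each replica still has to fit on a single pod, so the pod sizes matter as well.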
Here is the info I found from our docs:
Hi @christina , thank you very much for the answer.
I am still not following one thing - why can’t the `rayClusterConfig` resources be determined purely as a function of the `serveConfigV2`? i.e. Since each Ray Serve deployment declares how many replicas it needs and what resources it needs, can’t all of that just be summed up to define the `rayClusterConfig`? What’s the benefit of doing this manually versus having the cluster config be automatically determined based on what’s in `serveConfigV2`?
Hi @Steve_McClain! I went to go ask the engineering team and this is what they said in response to your question:
- TLDR: Yes, you probably could auto-generate Ray Cluster compute resources based on serve cluster resources if you wanted to, but we haven’t chosen to do it.
- Ray cluster config is also responsible for setting up the pod shapes, and pods and actors don’t have to be 1:1 (and for performance reasons you might not want them to be 1:1 either).
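As a rough illustration of the pod-shape point (my own sketch, not something from the engineering team or a Ray API): the hypothetical helper `pods_needed` below packs replicas into pods with a simple first-fit-decreasing heuristic. The replica sizes and pod sizes are made-up numbers, and only CPUs are considered.

```python
def pods_needed(replica_cpus, pod_cpus):
    """First-fit decreasing: place each replica in the first pod with room."""
    free = []  # CPUs still unused in each pod opened so far
    for need in sorted(replica_cpus, reverse=True):
        if need > pod_cpus:
            raise ValueError(f"replica needs {need} CPUs but a pod only has {pod_cpus}")
        for i, room in enumerate(free):
            if need <= room:
                free[i] = room - need
                break
        else:
            # No existing pod had room, so open a new one.
            free.append(pod_cpus - need)
    return len(free)


# Hypothetical workload: three replicas needing 3 CPUs each (9 CPUs in total).
replicas = [3, 3, 3]
for pod_cpus in (4, 6, 9):
    count = pods_needed(replicas, pod_cpus)
    print(f"{pod_cpus}-CPU pods: {count} pods, {count * pod_cpus} CPUs provisioned")
```

The same 9 CPUs of Serve demand ends up as three 4-CPU pods, two 6-CPU pods, or one 9-CPU pod, so summing the requests alone does not pin down a cluster config. Someone still has to pick the pod shape.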
Hopefully that helped clear some things up, let me know if there’s anything else that I could ask!