Autoscaling Ray Service with KEDA

denisbchrsk · February 13, 2024, 10:08pm

How severe does this issue affect your experience of using Ray?

High: It blocks me to complete my task.

We’ve been looking at using the built-in autoscaling in KubeRay. However, our use case is a bit more complicated: we use Ray Services that also consume from Apache Kafka (following the post “How can I integrate Apache Kafka with Ray Serve?”), so we want to autoscale on other signals, such as a consumer lag threshold.

We’ve therefore been looking at KEDA to cover that use case and scale accordingly. Unfortunately, it isn’t as straightforward as we had hoped.
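To make the goal concrete, here is a rough sketch of the scaling rule we have in mind: about one replica per `lag_threshold` messages of consumer lag, clamped to a minimum and maximum. The function name, parameters, and default bounds are our own illustrative assumptions, not KEDA or Ray APIs, and a real scaler may apply additional limits.

```python
import math


def desired_replicas(
    total_lag: int,
    lag_threshold: int,
    min_replicas: int = 1,   # hypothetical lower bound
    max_replicas: int = 10,  # hypothetical upper bound
) -> int:
    """Return a replica count for a given consumer lag.

    Illustrative only: one replica per `lag_threshold` messages of lag,
    rounded up, then clamped to [min_replicas, max_replicas].
    """
    if lag_threshold <= 0:
        raise ValueError("lag_threshold must be positive")
    wanted = math.ceil(total_lag / lag_threshold)
    return max(min_replicas, min(max_replicas, wanted))
```

With a threshold of 100, a lag of 250 messages would ask for 3 replicas, while zero lag falls back to the minimum.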

According to KEDA’s documentation on scaling CRDs, the only requirement is that the /scale subresource must be defined.

We thought we might be able to change the RayService CRD and add the /scale subresource (following “Extend the Kubernetes API with CustomResourceDefinitions | Kubernetes”). However, that doesn’t seem to be possible: in a RayService, we would need to scale the replicas of the relevant Ray Serve deployment, and its “num_replicas” field lives inside serveConfigV2, which is a string. As far as we can tell, there is no JSONPath we can set in specReplicasPath that reaches a value inside that string.
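The problem can be illustrated with a small sketch. The manifest below is a trimmed, hypothetical RayService (the application and deployment names are made up), and `resolve` is our own simplified stand-in for a dotted path lookup, not the Kubernetes JSONPath implementation. It shows that a path can reach the serveConfigV2 field itself, but cannot descend further because the value is an opaque string.

```python
from typing import Any

# Trimmed, hypothetical RayService manifest represented as a Python dict.
# Note that serveConfigV2 is a single YAML-formatted *string*, not a nested object.
ray_service = {
    "apiVersion": "ray.io/v1",
    "kind": "RayService",
    "spec": {
        "serveConfigV2": (
            "applications:\n"
            "  - name: kafka_consumer\n"      # made-up application name
            "    deployments:\n"
            "      - name: Consumer\n"        # made-up deployment name
            "        num_replicas: 2\n"
        ),
    },
}


def resolve(obj: Any, path: str) -> Any:
    """Walk a simple dotted path such as '.spec.replicas' through nested dicts."""
    current = obj
    for key in path.strip(".").split("."):
        if not isinstance(current, dict):
            # A string (or any other scalar) has no fields to descend into.
            raise TypeError(
                f"cannot descend into {type(current).__name__} at '{key}'"
            )
        current = current[key]
    return current


# The path reaches the field, but only as one opaque string.
config_text = resolve(ray_service, ".spec.serveConfigV2")

# Going any deeper fails, which is why no specReplicasPath can point at num_replicas.
try:
    resolve(ray_service, ".spec.serveConfigV2.applications")
except TypeError as err:
    failure = str(err)
```

Changing num_replicas therefore seems to require something that parses the string, edits it, and writes the whole field back, which the /scale subresource does not do.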

We would appreciate any help figuring this out.
