Hi Ray community,
I’m exploring the integration of KubeRay (Ray on Kubernetes) with DeepSpeed for large-scale distributed model training, but I’ve noticed a significant gap: while KubeRay + vLLM workflows are well-documented and mature (e.g., for high-throughput inference with autoscaling and multi-GPU support), DeepSpeed integrations seem almost nonexistent. Is combining them a viable idea? Could anyone share experiences or advice on co-deploying Ray and DeepSpeed in Kubernetes? A rough sketch of what I have in mind is below.
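For concreteness, this is roughly the kind of job I’m hoping to submit to a KubeRay cluster (e.g., via a RayJob): Ray Train’s `TorchTrainer` driving a DeepSpeed-wrapped model inside the per-worker training loop. The model, data, and DeepSpeed config here are toy placeholders of my own, not anything from an official example, so please correct me if this isn’t the intended pattern.

```python
# Minimal sketch (my assumptions): Ray Train's TorchTrainer launches the workers,
# and each worker wraps a toy model with deepspeed.initialize(). The DeepSpeed
# config values below are placeholders, not a recommendation.
import torch
import torch.nn as nn
import deepspeed

from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Toy model and random data, just to exercise the Ray Train + DeepSpeed wiring.
    model = nn.Linear(128, 1)
    ds_config = {
        "train_micro_batch_size_per_gpu": 8,
        "zero_optimization": {"stage": 2},
        "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    }
    # My understanding is that Ray Train sets up the torch.distributed process
    # group (RANK, WORLD_SIZE, etc.), and DeepSpeed reuses it here.
    model_engine, optimizer, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )
    for _ in range(10):
        x = torch.randn(8, 128).to(model_engine.device)
        y = torch.randn(8, 1).to(model_engine.device)
        loss = nn.functional.mse_loss(model_engine(x), y)
        model_engine.backward(loss)
        model_engine.step()


trainer = TorchTrainer(
    train_loop_per_worker,
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),
)
trainer.fit()
```

Assuming this pattern is sound, my remaining questions are mostly about the Kubernetes side: GPU scheduling, autoscaling, and whether anyone has run DeepSpeed this way on KubeRay in practice.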
Thanks