Hello everyone! I’m excited to share what we have planned for Q3 2025 for Ray. I will try to keep this updated as features get merged in and rolled out.
Goal: Deliver foundational reliability, performance, and DX improvements across Ray Core, Data, Train, LLM, Serve, RL, Observability, Technical Content, and KubeRay.
Ray Core
Reliability & Fault Tolerance
- Improve system stability under node and network failures, including making RPCs tolerant to transient errors
- Add robust support for preemptible instances
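The preemptible-instance work above comes down to treating preemption as a retryable event rather than a fatal failure. Here is a minimal pure-Python sketch of that pattern; the `Preempted` exception and `run_with_retries` helper are invented for illustration and are not Ray APIs:

```python
import time

class Preempted(Exception):
    """Hypothetical error signaling that a spot/preemptible node was reclaimed."""

def run_with_retries(task, max_retries=3, backoff_s=0.01):
    """Re-run `task` on preemption, with exponential backoff between attempts."""
    for attempt in range(max_retries + 1):
        try:
            return task(attempt)
        except Preempted:
            if attempt == max_retries:
                raise
            time.sleep(backoff_s * (2 ** attempt))  # back off before retrying

# Simulate a task that is preempted twice, then succeeds on a new attempt.
def flaky_task(attempt):
    if attempt < 2:
        raise Preempted()
    return "done"

result = run_with_retries(flaky_task)
print(result)  # -> done
```

In a real cluster the retry would land on a different node; the sketch only shows the control flow.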
Scheduling & Performance
- Introduce label-based scheduling for finer-grained resource control
- Implement GPU objects with RDMA transfer support for high-performance GPU data handling
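To make the label-based scheduling item concrete, here is a toy scheduler that matches a task's label selector against per-node labels. This is purely illustrative; `pick_node` and the label format are invented for this sketch and do not reflect the planned Ray API:

```python
def pick_node(nodes, selector):
    """Return the first node whose labels satisfy every key/value in `selector`."""
    for name, labels in nodes.items():
        if all(labels.get(k) == v for k, v in selector.items()):
            return name
    return None  # no node matches the selector

nodes = {
    "node-a": {"accelerator": "A100", "zone": "us-west-1"},
    "node-b": {"accelerator": "H100", "zone": "us-east-1"},
}

print(pick_node(nodes, {"accelerator": "H100"}))  # -> node-b
```

The appeal of labels over raw custom resources is exactly this kind of declarative matching: a task states what it needs, and the scheduler finds a node that satisfies it.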
Developer Experience
- Introduce ActorMesh for simplified interaction with groups of actors
- Improve static typing across the codebase to enhance developer productivity
- Address outstanding technical debt in core worker components
Ray Data
Reliability
- Ensure workloads complete successfully despite cluster failures
Performance
- Enhance training ingest pipelines with advanced sampling and caching support
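As a rough illustration of what sampling plus caching buys an ingest pipeline, the sketch below memoizes an expensive per-record transform so repeated epochs reuse it, while a seeded sampler keeps epochs reproducible. This is purely illustrative and not Ray Data's implementation:

```python
import random

cache = {}

def preprocess(record):
    """Expensive transform, memoized so later epochs hit the cache."""
    if record not in cache:
        cache[record] = record * 2  # stand-in for real preprocessing work
    return cache[record]

def sampled_epoch(dataset, fraction, seed):
    """Yield a reproducible random sample of the dataset, preprocessed."""
    rng = random.Random(seed)
    for record in dataset:
        if rng.random() < fraction:
            yield preprocess(record)

data = list(range(100))
epoch1 = list(sampled_epoch(data, fraction=0.5, seed=0))
epoch2 = list(sampled_epoch(data, fraction=0.5, seed=0))
assert epoch1 == epoch2  # same seed -> same sample; second pass hits the cache
```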
Connectors
- Improve Apache Iceberg integration
- Expand data catalog support, starting with Databricks Unity Catalog
Usability
- Schema UDFs
- Enhanced internal query planning
Ray Train
API
- Finalize Train v2 API
Performance
- Implement asynchronous checkpointing
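Asynchronous checkpointing means the training loop hands a snapshot to a background writer instead of blocking on disk I/O. A minimal sketch of that pattern using a background thread (illustrative only, not Ray Train's implementation):

```python
import json
import tempfile
import threading
from pathlib import Path

def save_checkpoint_async(state, path):
    """Write a snapshot of `state` to `path` on a background thread."""
    snapshot = dict(state)  # copy so training can keep mutating `state`
    t = threading.Thread(target=lambda: Path(path).write_text(json.dumps(snapshot)))
    t.start()
    return t

state = {"step": 0, "loss": 1.0}
ckpt = Path(tempfile.mkdtemp()) / "ckpt.json"

pending = None
for step in range(1, 4):
    state["step"] = step      # training continues while the last write runs
    if pending:
        pending.join()        # avoid overlapping two writes to the same file
    pending = save_checkpoint_async(state, ckpt)
pending.join()

print(json.loads(ckpt.read_text())["step"])  # -> 3
```

The key design point is the snapshot copy: the writer serializes a frozen view while the optimizer keeps stepping.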
LLM
Goal: Run large models (e.g., DeepSeek) at scale via vLLM on Ray Serve:
- Prefill disaggregation
- Large-scale data parallelism (DP)
- Custom request routing
- Elastic expert parallelism
Performance & Efficiency
- Implement prefill disaggregation to optimize performance for large-context models
- Develop an intelligent, KV cache-aware router with a pluggable architecture
- Implement data-parallel (DP) attention within Ray Serve
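The intuition behind a KV cache-aware router: send each request to the replica whose cached token prefixes overlap most with the prompt, so prefill computation is reused. A toy version of that selection rule (illustrative only; the real router is pluggable and its design is not shown here):

```python
def shared_prefix_len(a, b):
    """Length of the common token prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(prompt_tokens, replica_caches):
    """Pick the replica holding the longest cached prefix of this prompt."""
    return max(
        replica_caches,
        key=lambda r: max(
            (shared_prefix_len(prompt_tokens, p) for p in replica_caches[r]),
            default=0,
        ),
    )

# Each replica advertises the token prefixes currently in its KV cache.
caches = {
    "replica-0": [[1, 2, 3, 4]],
    "replica-1": [[1, 2, 9]],
}
print(route([1, 2, 3, 7], caches))  # -> replica-0 (3 shared tokens vs 2)
```

A production router would also weigh replica load and cache eviction, which this sketch ignores.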
Operations
- Publish updated performance benchmarks
Ecosystem
- Support SkyRL for reinforcement learning from human feedback (RLHF) workloads
Ray Serve
Serving Flexibility
- Custom auto‑scaling and routing patterns
- Async inference support
- MCP server patterns
- Integrate label-based scheduling
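Custom routing patterns usually reduce to a replica-selection policy. The sketch below implements the classic power-of-two-choices rule, routing each request to the less-loaded of two randomly sampled replicas; this is a generic load-balancing pattern, not the Ray Serve API:

```python
import random
from collections import Counter

def power_of_two_choices(loads, rng):
    """Sample two distinct replicas and route the request to the less-loaded one."""
    a, b = rng.sample(list(loads), 2)
    return a if loads[a] <= loads[b] else b

loads = {"replica-0": 5, "replica-1": 1, "replica-2": 9}
rng = random.Random(0)
counts = Counter(power_of_two_choices(loads, rng) for _ in range(200))
# The most-loaded replica never wins a pairing; the least-loaded wins most often.
print(counts.most_common(1)[0][0])  # -> replica-1
```

Power-of-two-choices gets most of the benefit of least-loaded routing while only probing two replicas per request, which is why it is a common default policy.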
Observability
- Enhanced tracing support
RLlib
- Ray RL V2 stack GA
- Algorithm composability enhancements
Observability
API Release
- Public launch of unified event export API
Optimization
- Refactor internals to leverage new export API
Technical Content
- New technical templates
- More examples & deep‑dives
KubeRay
Upgrades
- Productionize the incremental upgrade feature for seamless cluster updates
Hardware Support
- Streamline support for diverse accelerators, including multiple GPU types, Dynamic Resource Allocation (DRA), and MIG
Autoscaling
- Continue to improve the functionality and reliability of Autoscaler V2
We love hearing from the community! If there is a feature you’d like to see in Ray in the future, let us know by filing a feature request or commenting here. We also have a roadmap discussion on GitHub if you prefer to chat there.
Thank you for supporting Ray!