Ray Comparison with Flink

ivw · August 6, 2024, 6:15pm

I am considering using Ray for a new ML-data project and have used Flink in the past. The team on this project are heavy Python users (more ML scientists than strong engineers). I have previously used Flink with Scala, but I want a Python interface for this one. I have found many comparisons of Ray with PySpark and of Flink with Spark, but no comparison of Ray with PyFlink.

For anyone who is familiar with both or has made a similar decision: what were your deciding factors and learnings?

Generally, it seems like Flink’s watermarking and replay capabilities are the main distinguishing factors. Has anyone compared the performance of the two?

Sam_Chan · August 6, 2024, 8:50pm

@ivw check out our Ray Slack; I just searched it, and there are folks from Uber, Apple, and other companies in the community there who use Flink quite extensively and can answer your question.
