Question about ray.data for real-time stream processing

Hi everyone,
I have been following the development of Ray and using it to solve problems at work.
About two years ago, Ray collaborated with Ant Group to develop Ray Streaming, a Flink-like engine for real-time stream processing. At that time, I hoped that one day I could use Ray for real-time computing.
However, about a year ago, Ray Streaming was split out of Ray and became Mobius, and that project has not been updated for a year. The Ray Streaming development plan (which was kept in a Google document) has also disappeared.
Recently, I have been trying to use Ray for real-time stream processing, and I would like to extend ray.data to fit my needs.
I would like to ask whether Ray runs into insurmountable problems in real-time stream processing that led to this path being abandoned. If so, I will give up on it as well.
If possible, I hope someone can explain the reasons why I would have to give up.

Thanks :blush:

@kyoka_gong We are all a bit heads-down for Ray Summit, which is coming up next week. Replies/responses will be delayed until after next week.

cc: @ericl @chengsu

thx, jules.

I'm also looking forward to Ray Summit :blush:

Hi Jules, are there any updates here?

Well it’s been a year; can you answer your own question with your learnings? I for one am very interested in the comparison/choice between Ray and Flink.

For what it’s worth, Ray Data now natively uses streaming execution for datasets. You can read more about it in our docs:

One major difference between Ray and Flink is that Ray does not currently support unbounded data streams. Ray Data is more suitable for mixed CPU+GPU workloads, as we can take advantage of heterogeneous clusters.
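
To make the bounded vs. unbounded distinction concrete, here is a minimal conceptual sketch in plain Python generators. It deliberately does not use the Ray Data API, and every name in it (`read_blocks`, `map_blocks`, `unbounded_source`) is hypothetical and made up for illustration. It only shows the general idea: streaming execution pipelines blocks through operators without materializing the whole dataset, but the job still finishes because the input is finite. An unbounded source never ends, so it has to be consumed incrementally.

```python
from itertools import count, islice
from typing import Callable, Iterable, Iterator, List


def read_blocks(num_rows: int, block_size: int) -> Iterator[List[int]]:
    """Yield a bounded (finite) dataset one block at a time."""
    for start in range(0, num_rows, block_size):
        yield list(range(start, min(start + block_size, num_rows)))


def map_blocks(
    blocks: Iterable[List[int]], fn: Callable[[int], int]
) -> Iterator[List[int]]:
    """Apply fn to each block as it arrives, without waiting for all blocks."""
    for block in blocks:
        yield [fn(x) for x in block]


def unbounded_source(block_size: int) -> Iterator[List[int]]:
    """Hypothetical never-ending source, e.g. a message queue."""
    for start in count(0, block_size):
        yield list(range(start, start + block_size))


# Bounded input: two chained stages, each block flows through both lazily.
stage1 = map_blocks(read_blocks(num_rows=10, block_size=4), lambda x: x * 2)
stage2 = map_blocks(stage1, lambda x: x + 1)
bounded_result = list(stage2)  # terminates only because the input is finite

# Unbounded input: list(...) would never return, so take a finite prefix.
doubled = map_blocks(unbounded_source(block_size=4), lambda x: x * 2)
first_two = list(islice(doubled, 2))

print(bounded_result)
print(first_two)
```

In this toy model, the second case is the one a Flink-style engine is built around: the pipeline has no natural end, so results must be emitted continuously rather than collected when the job completes.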