How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
It seems that most of the use cases are related to batch/stream processing of incoming data, either for training, online serving, or data processing (which is usually not strictly real time).
I am exploring Ray for running inference at near-real-time speed on the edge (a single Jetson). I need some multiprocessing where I can control CPU assignment and data sharing.
Specifically, I will be processing images from a video stream in parallel (the same image processed in different ways), and I need the computation output as fast as possible to keep up with a given frame rate.
Would Ray be able to keep up performance-wise?
Keen to hear your thoughts.
@simone.zandara Thanks for the question! Definitely an interesting usecase.
Are you trying to process the images on a single host, or across multiple hosts (or a cluster-equivalent setup)?
Ray Core's APIs are more generic and less ML-specific, tailored to general distributed computing.
That said, running Ray does come with some storage and compute overhead.
Storage is not a huge deal; mostly I am worried about the compute overhead. Can you point me to some resources or code?
Resource management (such as GPU/CPU assignment) and multithreading are crucial for good performance, and Ray does seem to have some tooling to help with that.
Yes, it would aim at processing images on a single host. I work with ML on the edge with limited computing power. The use case is close to robotics or autonomous devices such as cars. There are better-suited tools for that, such as GStreamer or MediaPipe, but they are fairly disruptive to adopt and closer to production tooling; we do a lot of prototyping in Python instead.
Imagine running ML detection in one Ray actor while another actor uses the same image for lane following (perhaps with classic computer vision). Just an example.
@simone.zandara Where are we with respect to resolution? Did you get your question answered by @rickyyx? Seems like an interesting use case, but we have not come across many uses of Ray on edge devices.
As pointed out, Ray Core's APIs (actors and tasks) are more generic and less ML-specific, tailored to general distributed computing. You might be able to use Ray actors on the edge for single-image inference.
Let us know how you progress. If this works, it might be good to follow up with a blog post or a lightning meetup talk.
Can we mark this as resolved?
If possible, could someone point me to the aforementioned computational overhead of Ray with respect to task and actor scheduling?
So we have a microbenchmark covering running Ray on a single machine with empty tasks, which shows around 0.5-0.8 ms of overhead per Ray task. See this microbenchmark in particular: ray/ray_perf.py at b29890d1607378aa144d764635b5dc51bce17150 · ray-project/ray · GitHub
(It runs 1000 empty Ray tasks sequentially and waits for each one to return, so it approximately reflects the overhead of running a single Ray task on a single machine.)
There are more benchmark results here (though not in the most readable format for external users): ray/release/release_logs/2.3.0 at master · ray-project/ray · GitHub
Executing Ray tasks in a distributed cluster has many more confounding factors; I don't remember a test that produced a direct result for this, but I can ask around.
Thanks a lot for the help. I will take a look at the material and maybe do some prototyping.
@simone.zandara Thanks for your questions. Yes, the overhead amortizes over time with many Ray tasks; small Ray tasks do add overhead, but with more complex, compute-intensive tasks, past the initial small cost of scheduling and dispatching them, you see the benefits.
Relatedly, using too many small Ray tasks is considered an anti-pattern.