Best Way to Pipeline Serve App

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I’m creating an app to process images through various models and trying to find the best approach to handle this using Ray. I’m looking for something with the best performance. The basic pipeline looks like:

  1. Read in image
  2. Preprocessing step (CPU)
  3. Run model (GPU)
  4. Postprocessing step (CPU)
  5. Write result

The options I see are:

What is the scale and SLA you are working with here?

Each video is 10-50 frames, with several hundred videos processing concurrently. The top priority is processing speed, though initial startup delays are OK. Second, I want to make sure the GPU is fully saturated, so reading and processing the inputs should be as fast as possible. Finally, some fault tolerance would be ideal.

I have a fast working pipeline using Datasets, but the map method can be somewhat limiting, and mapping multiple classes adds a noticeable delay to each deployment call (not just an initial startup delay). I also ran a small test with plain ray.remote functions, which performed comparably to the Datasets pipeline. However, it seems to crash more often (possibly an implementation issue on my end), which is a problem if a few frames in a video don’t get processed.
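
To make the "a few frames don’t get processed" concern concrete, here is a framework-agnostic sketch of tracking frames by index and resubmitting only the ones that failed. It uses the standard library's thread pool rather than Ray, purely to show the bookkeeping; `process_with_retries` and its parameters are hypothetical names, and the same pattern could presumably be applied around `ray.get` on individual task references.

```python
from concurrent.futures import ThreadPoolExecutor


def process_with_retries(frames, fn, max_attempts=3, workers=4):
    """Run fn on every frame, resubmitting only the frames that raised.

    Returns (results, missing): results maps frame index to output, and
    missing lists the indices that still failed after max_attempts.
    """
    results = {}
    pending = dict(enumerate(frames))  # frame index -> frame
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_attempts):
            if not pending:
                break
            futures = {i: pool.submit(fn, f) for i, f in pending.items()}
            for i, fut in futures.items():
                try:
                    results[i] = fut.result()
                    del pending[i]
                except Exception:
                    # Leave the frame in pending so the next round retries it.
                    pass
    return results, sorted(pending)
```

Returning the list of still-missing indices lets the caller decide whether to fail the whole video or write a partial result.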

It seems like the new Workflows feature is just ray.remote but with better fault tolerance?