[Core] Question on optimizing machine learning project speed using ray

Eddie-kindergarden · May 6, 2021, 1:17pm

Hi all, I am currently optimizing the execution speed of a machine learning project. The main thread has two parts: one runs inference on the GPU using the PyTorch Lightning framework, and the other does image processing on the batch output. To make the best use of the CPU and GPU, I use a queue to cache the GPU batch output and a background worker thread to handle the image processing tasks from the queue, so that the GPU can run independently. However, this design runs into the Python GIL: the image processing thread sometimes has to wait for the main thread, which slows down the total inference speed. I wonder if there is a better design using Ray actors or some other mechanism. Many thanks in advance!

simon-mo · May 6, 2021, 6:08pm

For this use case you can use two Ray actors: one actor for image processing and the other for inference.

Eddie-kindergarden · May 7, 2021, 12:31pm

@simon-mo Thanks so much. One thing I am concerned about is where to initialize the second actor. Do you mean I could initialize an actor inside the PyTorch Lightning predict step?

simon-mo · May 7, 2021, 5:50pm

Ideally you would put image processing in one actor, and wrap the PyTorch Lightning part in another actor. Then your code can ship the data around:

single_out_ref = predictor.remote(process.remote(input))

Eddie-kindergarden · May 8, 2021, 6:51am

Thanks so much, I am going to implement it.

Clark_Zinzow · February 1, 2022, 4:31pm

Hi @Eddie-kindergarden, I stumbled upon this question and was wondering if you had tried out Ray Datasets? It supports this use case pretty well; we actually have an example for pipelined parallel batch inference in our docs!
