Ray Serve with gRPC request

Hi,

I’m using Ray Serve to serve a classification model. Everything works, but it is slow compared to a plain Python script (the model needs to run on thousands of images in a directory).

I think the cause is the HTTP requests I’m using. Is there a way to send requests over gRPC in Ray Serve?

If so, how can I do it?

Hi @madgulasatish, gRPC doesn’t magically make requests go faster. To clarify the workload: are you sending all the image bytes via the HTTP requests? Do the Ray Serve processes have access to those images? How are you encoding, decoding, and compressing the images in flight?


As Simon pointed out above, HTTP versus gRPC might not be the root cause of the slowness you observed; other Ray internals might be. It would help if you could provide a bit more context on your workload so we can help identify and profile potential bottlenecks.
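As a starting point for that kind of profiling, here is a minimal sketch that times each stage of a request separately. It uses only the standard library; the `timed` helper, the `stage_seconds` dictionary, and the stage names are hypothetical, and the JSON encode/decode calls are stand-ins for whatever your client and deployment actually do.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name (hypothetical helper state).
stage_seconds = defaultdict(float)


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the block to stage_seconds[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start


# Stand-in for one image: a 256x256 nested list of pixel values.
image = [[(x + y) % 256 for x in range(256)] for y in range(256)]

with timed("encode"):
    body = json.dumps({"array": image})  # stand-in for building the request body

with timed("decode"):
    decoded = json.loads(body)["array"]  # stand-in for parsing it on the server

for stage, seconds in stage_seconds.items():
    print(f"{stage}: {seconds * 1000:.2f} ms")
```

Wrapping the image read, the request itself, and the model call in the same way should show which stage dominates before changing the transport.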

Hi @simon-mo ,

This is the code I am using to send the requests:

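import os

import cv2
import ray
import requests

@ray.remote
def send_request(filepath):
    # Load the image as a single-channel (grayscale) array.
    img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
    # preprocessing
    # Send the pixel values to the Serve endpoint as a nested JSON list.
    return requests.get("http://localhost:9001/predict",
                        json={"array": img.tolist()})


# Launch one Ray task per file in the directory `path`.
resp = [send_request.remote(os.path.join(path, file)) for file in os.listdir(path)]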

Currently Ray Serve can access the images, but not always.

I thought the issue was the HTTP requests because we were using TF-Serving with REST and then switched to gRPC, which improved performance. (Comparison of gRPC and REST with TF-Serving)

As you said, gRPC may not be the issue.

Is there any other way I can make the code more efficient?
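One thing that may be worth checking, independent of the transport, is the payload encoding: `json={"array": img.tolist()}` writes every pixel as decimal text. The sketch below is standard library only and compares that JSON body with the raw bytes for a synthetic 256x256 grayscale image. The image size, the pixel pattern, and the helper name `json_payload_size` are assumptions for illustration, not measurements from this workload.

```python
import json


def json_payload_size(pixels, width):
    """Bytes in the JSON body that json={"array": img.tolist()} would produce."""
    rows = [list(pixels[i:i + width]) for i in range(0, len(pixels), width)]
    return len(json.dumps({"array": rows}).encode("utf-8"))


# Synthetic grayscale image: one byte per pixel (assumed size, not from the thread).
width, height = 256, 256
pixels = bytes((x * y) % 256 for y in range(height) for x in range(width))

raw_size = len(pixels)  # one byte per pixel when sent as raw bytes
json_size = json_payload_size(pixels, width)

print(f"raw bytes: {raw_size}")
print(f"JSON list: {json_size}")
print(f"ratio:     {json_size / raw_size:.1f}x")
```

If the JSON body turns out to dominate, sending the encoded file bytes in the request body, or just the file path when the Serve replicas can read the directory, may be worth trying before switching protocols.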

Hi, we’ve created an RFC for gRPC support in [Feature] RFC-Serve: Support gRPC · Issue #20854 · ray-project/ray · GitHub, and we are open to community feedback and examples. It would be great if you could comment there with your context so we can better gauge demand for prioritization. Thanks!