I’m using Ray Serve to serve a classification model. Everything works fine, but it is slow compared to a plain Python script (the model needs to run on thousands of images in a directory).
I figured it is because of the HTTP requests that I’m using. Is there a way to send requests over gRPC in Ray Serve?
Hi @madgulasatish, gRPC doesn’t magically make the request go faster. To clarify the workload: are you sending all the image bytes via the HTTP requests? Do the Ray Serve processes have access to those images? How are you encoding, decoding, and compressing those images in flight?
As Simon pointed out above, the lack of gRPC might not be the root cause of the slowness you observed; other Ray internals might be. It would be helpful if you could provide a bit more context on your workload so we can help identify and profile potential bottlenecks.
import os
import cv2
import ray
import requests

@ray.remote
def send_request(filepath):
    img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
    # preprocessing
    return requests.get("http://localhost:9001/predict",
                        json={"array": img.tolist()})

resp = [send_request.remote(os.path.join(path, file)) for file in os.listdir(path)]
Currently, Ray Serve can access the images, but not always.
I thought the issue was with the HTTP requests because we were using TF-Serving with REST and then switched to gRPC, which improved performance. (Comparison of gRPC and REST with TF-Serving)
As you said, gRPC may not be the issue.
Is there any other way I can make the code more efficient?
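Not sure this is the bottleneck in your case, but one thing that may be worth measuring is the size of the payload produced by `img.tolist()`, since every pixel gets written out as decimal text in the JSON body. Below is a rough sketch using only the standard library. The 224x224 size and the random pixel values are made-up stand-ins for a real grayscale image, and `decode_raw` is a hypothetical helper, not part of Ray Serve.

```python
import json
import random
import zlib

# Hypothetical stand-in for a grayscale image (assumed 224x224, values 0-255).
# In the snippet above this would be the array returned by cv2.imread(...).
HEIGHT, WIDTH = 224, 224
rng = random.Random(0)
pixels = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]

# What json={"array": img.tolist()} sends: each pixel as decimal text plus separators.
json_payload = json.dumps({"array": pixels}).encode("utf-8")

# The same pixels as raw bytes, one byte per pixel.
raw_payload = bytes(value for row in pixels for value in row)

# Optional lossless compression. Random noise will not shrink;
# real images usually compress better than this.
compressed_payload = zlib.compress(raw_payload)


def decode_raw(payload, height, width):
    """Hypothetical receiver-side helper: rebuild the 2D pixel list from raw bytes."""
    return [list(payload[r * width:(r + 1) * width]) for r in range(height)]


print("JSON bytes:      ", len(json_payload))
print("raw bytes:       ", len(raw_payload))
print("compressed bytes:", len(compressed_payload))
```

If the JSON body turns out to be several times larger than the raw bytes for your images, sending bytes in the request body (for example with `requests.post(..., data=raw_payload)`) and decoding them on the serving side might help. Profiling the actual deployment is still the only way to know where the time goes.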