Ray Serve - Observing high latencies when using a custom Docker image

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task, as this needs to go to production soon.

Hello,

Initially, I was using the stock rayproject/ray:2.39.0 image and supplying my dependencies through runtime_env for each deployment.
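
For context, the previous setup looked roughly like this (the deployment class and package list are illustrative):

from ray import serve

# Dependencies were supplied per deployment via runtime_env,
# on top of the stock rayproject/ray:2.39.0 image.
@serve.deployment(
    ray_actor_options={
        "runtime_env": {"pip": ["mlflow==2.15.1"]}
    }
)
class Predictor:
    def __init__(self):
        ...  # model loading

    async def __call__(self, request):
        ...  # inference

app = Predictor.bind()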

Now, I am packaging my ML model as an MLflow pyfunc model, so I needed to create a custom Dockerfile that downloads the model artifacts and installs their dependencies into the image. My Dockerfile looks like this:

FROM rayproject/ray:2.39.0

# These build args are referenced below; without declaring them they
# would expand to empty strings at build time.
ARG MODEL_NAME
ARG MODEL_VERSION
ARG AWS_TOKEN

RUN pip install mlflow==2.15.1

COPY utils /utils
COPY ray-serve-app /serve-app

# Download the model artifacts into /tmp/model inside the image.
RUN sudo mkdir -p /tmp/model && sudo chmod -R 777 /tmp/model && python /utils/download_model_artifacts.py ${MODEL_NAME} ${MODEL_VERSION}

# Substitute the auth token into the credentialed URL recorded in conda.yaml.
RUN sed -i.bak -E "s|(https://.*:).*(@.*)|\1${AWS_TOKEN}\2|" "/tmp/model/conda.yaml"

# Install the model's dependencies into the base conda environment.
RUN conda env update --name base -f /tmp/model/conda.yaml

WORKDIR /serve-app

# Quoted so the shell does not glob the [serve] extra.
RUN pip install "ray[serve]==2.39.0"
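
For completeness, the Serve app loads the model once per replica at startup, from the path baked into the image (simplified sketch; the deployment name and request handling are illustrative):

from ray import serve
import mlflow.pyfunc

@serve.deployment
class MLflowModel:
    def __init__(self):
        # Load the pyfunc model once, at replica startup,
        # from the artifacts baked into the image.
        self.model = mlflow.pyfunc.load_model("/tmp/model")

    async def __call__(self, request):
        payload = await request.json()
        return self.model.predict(payload)

app = MLflowModel.bind()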

My per-request latencies have jumped from about 2 ms to almost 15 ms when I use this image for my Ray Serve deployment. Can you tell me what I am doing wrong, or where I should look for the root cause?
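
For reference, this is roughly how I measure the per-request latency (the endpoint URL and payload are illustrative):

import time
import requests

# Time sequential requests against the Serve HTTP endpoint and
# report the mean per-request latency in milliseconds.
latencies = []
for _ in range(100):
    start = time.perf_counter()
    requests.post("http://localhost:8000/", json={"inputs": [1, 2, 3]})
    latencies.append((time.perf_counter() - start) * 1000)

print(f"mean latency: {sum(latencies) / len(latencies):.2f} ms")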