How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task, as this needs to go into production soon.
Hello,
Initially, I was using the image rayproject/ray:2.39.0 with runtime_envs for deployment.
Now I am bundling my ML model with MLflow pyfunc, so I needed to create a custom Dockerfile to load my model. My Dockerfile looks like this:
FROM rayproject/ray:2.39.0
RUN pip install mlflow==2.15.1
COPY utils /utils
COPY ray-serve-app /serve-app
RUN sudo mkdir -p /tmp/model && sudo chmod -R 777 /tmp/model && python /utils/download_model_artifacts.py ${MODEL_NAME} ${MODEL_VERSION}
RUN sed -i.bak -E "s|(https://.*:).*(@.*)|\1${AWS_TOKEN}\2|" "/tmp/model/conda.yaml"
RUN conda env update --name base -f /tmp/model/conda.yaml
WORKDIR /serve-app
RUN pip install ray[serve]==2.39.0
My deployment latency per request has jumped from 2 ms to almost 15 ms when I use this image for my Ray Serve deployment. Can you tell me what I’m doing wrong, or where I should look for the root cause?
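In case it helps with reproducing the numbers, this is roughly how the per-request latency could be measured against each image. It is only a minimal sketch: `measure_latency_ms` and `fake_request` are hypothetical names, and `fake_request` merely stands in for the real call to the Serve endpoint, so whatever it prints is not my actual measurement.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(
    send_request: Callable[[], object], n: int = 200, warmup: int = 20
) -> Dict[str, float]:
    """Time n calls of send_request and summarize the latency in milliseconds."""
    # Warm-up calls are not timed, so one-off costs (connection setup,
    # lazy model loading) do not skew the summary.
    for _ in range(warmup):
        send_request()

    samples = []
    for _ in range(n):
        start = time.perf_counter()
        send_request()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "mean": statistics.fmean(samples),
        "p50": statistics.median(samples),
        "p95": samples[p95_index],
        "max": samples[-1],
    }


def fake_request() -> None:
    # Hypothetical placeholder: replace with the real HTTP call or
    # deployment handle call to the Serve application under test.
    time.sleep(0.001)


if __name__ == "__main__":
    print(measure_latency_ms(fake_request, n=50, warmup=5))
```

Running the same function against a deployment on the stock image and on the custom image, with identical inputs, should show whether the gap is in the median or only in the tail.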