1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.
2. Environment:
- Ray version: 2.54
- Python version: 3.12
- OS: Ubuntu 24.04
- Cloud/Infrastructure: Single docker container
- Other libs/tools (if relevant): FastAPI based app
3. What happened vs. what you expected:
- Expected: I would like to load machine learning models from a local model registry (a local folder) that is mounted as a Docker volume.
- Actual: The Ray cluster creates a copy of the model registry's contents, which are quite heavy.
I am trying to deploy a Ray Serve application (in combination with FastAPI) in a single container. I need to run ML inference with many different models, so I am taking advantage of the multiplexed models feature. The multiplexed model-loading code simply reads an ONNX model from a local path, which is my model registry (I do not have a separate model registry service, just a plain local folder). The model registry is mounted as a volume in the Docker container, since I do not want to include it at build time, given its size. However, when the entrypoint launches the Ray cluster, the model registry is copied, which causes memory and disk problems.
Is there a way to ignore the folder (some sort of "rayignore") while still being able to read from it at runtime?
Thanks,
Marco