Load models from Docker volume without creating copies

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.

2. Environment:

  • Ray version: 2.54
  • Python version: 3.12
  • OS: Ubuntu 24.04
  • Cloud/Infrastructure: Single docker container
  • Other libs/tools (if relevant): FastAPI based app

3. What happened vs. what you expected:

  • Expected: I would like to load machine learning models from a local model registry (a local folder) that is mounted as a Docker volume.
  • Actual: The Ray cluster creates a copy of the data in the model registry, which is quite heavy.

I am trying to deploy Ray Serve (in combination with FastAPI) in a single container. I need to run ML inference with many different models, so I am using the model multiplexing feature. The code that loads the models in the multiplexed function simply reads the ONNX model from a local path, which is my model registry (I do not have a separate model registry service, just a local folder). The model registry is mounted as a volume in the Docker container, since I do not want to include it at build time given its size. However, when the entrypoint launches the Ray cluster, the model registry is copied, which causes memory and disk issues.

Is there a way to make Ray ignore the folder (some sort of rayignore) while still being able to read from it?

Thanks,

Marco

Ray Serve does not provide a dedicated option for excluding a mounted model registry folder, and there is no official support for a .rayignore file as of Ray 2.54, according to the available data sources. The copy you observed most likely comes from the runtime environment: when a working_dir is specified, Ray may package everything under that directory, including large volumes mounted inside it, which leads to excessive disk and memory usage.

A common workaround is to keep the model registry folder outside the working directory specified in the Ray runtime environment or job submission, and to reference the mounted path directly (by absolute path) in your code. If the working directory is the parent of the model registry, Ray will attempt to copy everything under it. In that case, the closest documented exclusion mechanism is the runtime environment's excludes field, which takes gitignore-style patterns for paths under working_dir that should not be uploaded. For more details on related issues and workarounds, see the discussions at Ray GitHub Issue #59569 and Ray Discuss: Making model accessible across nodes.
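The workaround above can be sketched with a few standard-library helpers. This is a minimal sketch under stated assumptions, not a definitive setup: the paths `/app` and `/models`, the `<model_id>.onnx` file naming, and the helper names are all hypothetical. `working_dir` and `excludes` are runtime-environment keys as I understand them from the Ray documentation; check the behaviour against your Ray version.

```python
from pathlib import Path

# Hypothetical container layout (assumption, adjust to your image).
WORKING_DIR = Path("/app")        # application code that Ray may package
MODEL_REGISTRY = Path("/models")  # Docker volume, mounted outside WORKING_DIR


def is_strictly_inside(child: Path, parent: Path) -> bool:
    """Return True if `child` lies somewhere beneath `parent`."""
    return parent.resolve() in child.resolve().parents


def build_runtime_env(working_dir: Path, registry: Path) -> dict:
    """Build a runtime_env dict that keeps the registry out of the upload.

    If the registry sits under working_dir, add a gitignore-style
    pattern (relative to working_dir) to the `excludes` list.
    """
    env = {"working_dir": str(working_dir)}
    if is_strictly_inside(registry, working_dir):
        rel = registry.resolve().relative_to(working_dir.resolve())
        env["excludes"] = [f"{rel.as_posix()}/"]
    return env


def model_path(model_id: str, registry: Path = MODEL_REGISTRY) -> Path:
    """Absolute path of a model file, read in place from the mounted volume.

    Assumes one `<model_id>.onnx` file per model (hypothetical naming) and
    rejects ids that would resolve to a location outside the registry.
    """
    candidate = (registry / f"{model_id}.onnx").resolve()
    if not is_strictly_inside(candidate, registry):
        raise ValueError(f"model id escapes the registry: {model_id!r}")
    return candidate
```

The dict returned by `build_runtime_env` is meant to be passed wherever you currently set the runtime environment, and `model_path` is meant to be called from the multiplexed loader so that the ONNX file is opened by absolute path rather than relative to the packaged working directory.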

Would you like more detail on possible workarounds or deployment patterns to avoid this issue?
