Use an image from a private registry in Ray cluster config

Pulling from a Private Registry

How do we authenticate to a private image registry (such as ECR) in the docker section of the Ray cluster config?


docker:
  image: ****.dkr.ecr.us-east-2.amazonaws.com/***/***:v0.1.1   # private image registry
  container_name: "ray_cpu"
  pull_before_run: True
  run_options:
    - --ulimit nofile=65536:65536

Hi, I am running into the same issue with Azure Container Registry (ACR). Does anyone have a solution? I have logged in to my ACR from the host OS and can pull from it there, but when I run ray up example-full.yaml I get:

Error response from daemon: Head "https://myregistry.azurecr.io/v2/myimage/manifests/mytag": unauthorized:
Shared connection to xx.x.x.xx closed.
2025-01-22 04:45:37,636 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['xx.x.x.xx']
New status: update-failed
!!!
Exception details: {'message': 'SSH command failed.'}
Full traceback: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/updater.py", line 159, in run
    self.do_update()
  File "/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/updater.py", line 451, in do_update
    self.cmd_runner.run_init(
  File "/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/command_runner.py", line 722, in run_init
    self.run(
  File "/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/command_runner.py", line 493, in run
    return self.ssh_command_runner.run(
  File "/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/command_runner.py", line 379, in run
    return self._run_helper(
  File "/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/command_runner.py", line 298, in _run_helper
    raise click.ClickException(fail_msg) from None
click.exceptions.ClickException: SSH command failed.

Error message: SSH command failed.
!!!

Failed to setup head node.

@Ben_H I had to do two things to get this working on AWS:

1. Edit the role permissions of my IAM instance profile to allow pulls from ECR.

2. Use the Amazon ECR credential helper so that Docker on the node authenticates with the IAM instance role's permissions. I installed and enabled it through initialization_commands:

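initialization_commands:
  # Point Docker at the ecr-login credential helper, then install the helper
  # (apt implies a Debian/Ubuntu node image).
  - |
    mkdir -p $HOME/.docker && \
      echo '{ "credsStore": "ecr-login" }' > $HOME/.docker/config.json && \
      sudo apt update && \
      sudo apt install -y amazon-ecr-credential-helper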
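
For anyone putting the two steps together, here is a rough sketch of how the relevant parts of an AWS cluster config might look. It is a fragment, not a complete config, and it was not posted by anyone in this thread. The values in angle brackets are placeholders, the instance type is arbitrary, and the node type name ray.head.default is assumed from Ray's example-full.yaml. The ECR actions listed in the comment are the ones usually needed for a pull; check them against your own role policy.

```yaml
# Fragment only: provider, auth, head_node_type, etc. are omitted.
docker:
  image: "<account-id>.dkr.ecr.us-east-2.amazonaws.com/<repo>:<tag>"  # private ECR image (placeholder)
  container_name: "ray_cpu"
  pull_before_run: True

available_node_types:
  ray.head.default:            # name assumed from Ray's example-full.yaml
    node_config:
      InstanceType: m5.large   # arbitrary choice for illustration
      # Step 1: instance profile whose role allows ECR pulls, e.g.
      # ecr:GetAuthorizationToken, ecr:BatchCheckLayerAvailability,
      # ecr:GetDownloadUrlForLayer and ecr:BatchGetImage.
      IamInstanceProfile:
        Arn: "arn:aws:iam::<account-id>:instance-profile/<profile-name>"  # placeholder

# Step 2: these commands run on the host, outside the container and before
# Docker is set up, so the credential helper is in place when the image is pulled.
initialization_commands:
  - |
    mkdir -p $HOME/.docker && \
      echo '{ "credsStore": "ecr-login" }' > $HOME/.docker/config.json && \
      sudo apt update && \
      sudo apt install -y amazon-ecr-credential-helper
```

If worker nodes pull the same image, they would presumably need an instance profile with the same ECR permissions in their own node_config as well.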