Ray worker nodes do not launch when aws configure is run

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

I am launching a Ray cluster from a YAML file and I am facing a strange issue. I am using my own AMI with miniconda installed, along with the CUDA libraries and the required Ray packages. Normally, when I run ray up -y cluster.yaml, everything works fine and my Ray cluster comes up. However, if I add aws configure commands under setup_commands, only the head node comes up and none of the worker nodes get launched. I tried logging into the head node and running ray status, and I get nothing back.

Here is what I am modifying (I have simply added the conda activate ray and the aws configure commands):

setup_commands:
    - echo 'export PATH="/home/ubuntu/miniconda3/condabin:$PATH"' >> ~/.bashrc;
        echo 'export PATH="/home/ubuntu/miniconda3/envs/ray/bin:$PATH"' >> ~/.bashrc;
        echo 'export PYTHONPATH=$PYTHONPATH:~/scripts' >> ~/.bashrc;
    - conda activate ray;
        aws configure set aws_access_key_id XXXXXXXXXXXXXXXXX;
        aws configure set aws_secret_access_key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX;
        aws configure set default.region us-east-1;
        aws configure set default.output text;

Any idea why this is happening?

Thanks @indrajitsg. What do you see in the logs under /tmp/ray/session_latest?
Check out log_monitor.[out|err], monitor.log, and monitor.[out|err].

More info on the log structure:
https://docs.ray.io/en/master/ray-observability/ray-logging.html?highlight=worker%20startup%20logs#id1
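If it helps, here is a minimal sketch for skimming those files on the head node instead of opening each one by hand. It assumes the default layout where the files sit in a logs/ subdirectory of /tmp/ray/session_latest. The function name find_suspect_lines and the keyword list are made up for illustration; they are not part of Ray.

```python
from pathlib import Path

# Log files named in the reply above.
LOG_NAMES = [
    "monitor.log",
    "monitor.err",
    "monitor.out",
    "log_monitor.err",
    "log_monitor.out",
]

# Hypothetical keyword list, not an official set of Ray error markers.
KEYWORDS = ("error", "exception", "traceback", "unauthorized", "denied")


def find_suspect_lines(log_dir="/tmp/ray/session_latest/logs", keywords=KEYWORDS):
    """Return (file name, line number, text) for lines containing any keyword."""
    hits = []
    for name in LOG_NAMES:
        path = Path(log_dir) / name
        if not path.is_file():
            continue  # not every file exists on every node
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                lowered = line.lower()
                if any(k in lowered for k in keywords):
                    hits.append((name, lineno, line.rstrip()))
    return hits


if __name__ == "__main__":
    for name, lineno, line in find_suspect_lines():
        print(f"{name}:{lineno}: {line}")
```

Matching is a plain case-insensitive substring check, so expect some noise. The point is only to surface the lines worth reading in full.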

Thanks @Ameer_Haj_Ali - I checked monitor.err and monitor.log. Based on the logs, I was able to identify the issue: the set of credentials I was passing was missing certain permissions.

Indrajit
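For anyone landing here with the same symptom: aws configure set aws_access_key_id writes static keys into ~/.aws/credentials, and the standard AWS credential lookup prefers that file over the instance's IAM role. The sketch below assumes (this thread does not confirm it) that this is how the under-permissioned keys ended up being used to launch workers. It only lists which profiles in the credentials file carry static keys, so you can tell whether a node has them. The helper name profiles_with_static_keys is made up for illustration.

```python
import configparser
from pathlib import Path


def profiles_with_static_keys(credentials_path=None):
    """Return the profile names in an AWS credentials file that define an access key."""
    if credentials_path is None:
        # Default location used by the AWS CLI.
        credentials_path = Path.home() / ".aws" / "credentials"
    # interpolation=None so a '%' in a value is never treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(credentials_path)  # a missing file simply yields no sections
    return [
        section
        for section in parser.sections()
        if parser.has_option(section, "aws_access_key_id")
    ]


if __name__ == "__main__":
    found = profiles_with_static_keys()
    if found:
        print("Static keys found for profiles:", ", ".join(found))
    else:
        print("No static keys in ~/.aws/credentials")
```

If this prints a profile on the head node, it is worth checking whether those keys are allowed to do everything the cluster launcher needs. The logs mentioned above should show which call was rejected.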