Hi Team,
I have deployed Ray on EKS using the Helm chart, and I pass a pip runtime_env to ray.init. Connecting through Ray Client fails with "Failed to create runtime_env for Ray client server".
This is how I am initializing Ray:
ray.init(address="ray://example-cluster-ray-head:10001", namespace="ray", runtime_env={"pip": ["ray[default]", "ray[serve]", "flair==0.9", "nltk==3.6.2"]})
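The same call, laid out as a minimal sketch for readability. The helper names `build_runtime_env` and `connect` are hypothetical (they are not in my script and not part of Ray's API); the address, namespace, and pinned package versions are the ones from the call above.

```python
# Hypothetical helpers, for illustration only; not part of FlairModel.py or Ray.

def build_runtime_env():
    """Return the runtime_env dict passed to ray.init (same packages as above)."""
    return {
        "pip": [
            "ray[default]",
            "ray[serve]",
            "flair==0.9",
            "nltk==3.6.2",
        ]
    }


def connect(address="ray://example-cluster-ray-head:10001"):
    """Connect through Ray Client using the runtime_env above."""
    import ray  # imported lazily so the dict can be inspected without Ray installed

    return ray.init(
        address=address,
        namespace="ray",
        runtime_env=build_runtime_env(),
    )
```

Calling `connect()` from the head node is equivalent to the one-line `ray.init(...)` call above.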
Here is the complete traceback.
❯ ray submit example-full.yaml FlairModel.py
2022-01-13 14:23:54,613 INFO util.py:282 -- setting max workers for head node type to 0
Loaded cached provider configuration
If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
2022-01-13 14:24:06,024 INFO util.py:282 -- setting max workers for head node type to 0
2022-01-13 14:24:06,148 INFO command_runner.py:172 -- NodeUpdater: example-cluster-ray-head-type-s59z8: Running kubectl -n ray exec -it example-cluster-ray-head-type-s59z8 -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (python ~/FlairModel.py)'
Traceback (most recent call last):
  File "/home/ray/FlairModel.py", line 2, in <module>
    ray.init(address="ray://example-cluster-ray-head:10001", namespace="ray", runtime_env={"pip": ["ray[default]", "ray[serve]", "flair==0.9", "nltk==3.6.2"]})
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 775, in init
    return builder.connect()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/client_builder.py", line 155, in connect
    ray_init_kwargs=self._remote_init_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client_connect.py", line 42, in connect
    ray_init_kwargs=ray_init_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/__init__.py", line 228, in connect
    conn = self.get_context().connect(*args, **kw_args)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/__init__.py", line 88, in connect
    self.client_worker._server_init(job_config, ray_init_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/worker.py", line 698, in _server_init
    f"Initialization failure from server:\n{response.msg}")
ConnectionAbortedError: Initialization failure from server:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 624, in Datapath
    client_id, job_config):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 281, in start_specific_server
    specific_server=specific_server,
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 234, in _create_runtime_env
    "Failed to create runtime_env for Ray client "
RuntimeError: Failed to create runtime_env for Ray client server: [Errno 2] No such file or directory: '/tmp/ray/session_2022-01-12_16-25-08_622146_119/runtime_resources/conda/6aa502b772617ee2bec7c93351970b6ed85aa479'
command terminated with exit code 1
2022-01-13 14:28:45,460 ERROR command_runner.py:182 -- NodeUpdater: example-cluster-ray-head-type-s59z8: Command failed:
kubectl -n ray exec -it example-cluster-ray-head-type-s59z8 --'bash --login -c -i '"'"'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (python ~/FlairModel.py)'"'"''
Thanks in advance!