Something may have gone wrong while installing the runtime_env `pip` packages

Hi Folks! I am trying to run my application via ray-client using the following code:

ray.init(address=ray_address, runtime_env={"working_dir": "./", "excludes": excludes, "py_modules": ["src/word_autocomplete"], "pip": "/home/jackson/Next_Word_Autocomplete/word-autocomplete/requirements.txt"})

But it shows this error:

ValueError: Local directory /tmp/ray/session_2023-08-07_09-42-41_348346_2128188/runtime_resources/pip/be761d619fdc1882001721f8dcc62801c604c6f1 for URI pip://be761d619fdc1882001721f8dcc62801c604c6f1 does not exist on the cluster. Something may have gone wrong while installing the runtime_env pip packages.

Does anyone know why this occurs?
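For reference, the `pip` field of `runtime_env` accepts either a path to a local requirements file (as in the call above) or a list of requirement strings. Below is a minimal sketch that reads the file into a list and fails early if the file is missing, which makes it easier to see exactly what is being requested from the cluster. The helper names `read_requirements` and `build_runtime_env` are made up for illustration, and the parser is deliberately simple: it keeps every non-blank, non-comment line verbatim and does not handle inline comments or `-r` includes. It is not expected to fix the error by itself.

```python
from pathlib import Path


def read_requirements(path):
    """Return the non-blank, non-comment lines of a requirements file."""
    lines = Path(path).read_text().splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def build_runtime_env(requirements_path, excludes, py_modules):
    """Assemble a runtime_env dict; raise early if the requirements file is missing."""
    req = Path(requirements_path)
    if not req.is_file():
        raise FileNotFoundError(f"requirements file not found: {req}")
    return {
        "working_dir": "./",
        "excludes": list(excludes),
        "py_modules": list(py_modules),
        # Passed as a list of requirement strings instead of a file path.
        "pip": read_requirements(req),
    }


# Usage (needs a reachable cluster, so it is left as a comment):
# import ray
# env = build_runtime_env(requirements_path, excludes, ["src/word_autocomplete"])
# ray.init(address=ray_address, runtime_env=env)
```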

@Jackson What’s your address=ray_address? Also, is this a cluster or local single node?

cc: @architkulkarni Have you seen this before?

Are you able to see any relevant runtime env setup logs on any of the relevant nodes? The "Configuring Logging" page in the Ray 2.6.1 docs describes where to find them.
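If it helps, here is a rough sketch for pulling the suspicious lines out of the runtime env setup logs on a node. It assumes the default log directory `/tmp/ray/session_latest/logs` and log files named `runtime_env_setup-*.log`; adjust both if your cluster is configured differently. The function name `find_setup_errors` is hypothetical, and the matching is a plain substring check, not a full log parser.

```python
from pathlib import Path


def find_setup_errors(log_dir="/tmp/ray/session_latest/logs"):
    """Yield (file name, line number, text) for error-looking lines in setup logs."""
    # Assumption: setup logs follow the runtime_env_setup-*.log naming pattern.
    for log_file in sorted(Path(log_dir).glob("runtime_env_setup-*.log")):
        with log_file.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                # Match Ray's ERROR log level and tool output such as "error: ...".
                if " ERROR " in line or "error:" in line:
                    yield log_file.name, lineno, line.rstrip()


# Usage, run on the node whose logs you want to inspect:
# for name, lineno, text in find_setup_errors():
#     print(f"{name}:{lineno}: {text}")
```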

Hi @Jules_Damji . My ray_address is “ray://192.168.1.76:10001”. This is the head node of the cluster.

@architkulkarni Yes, this is the error it shows.

Installing collected packages: torch
Attempting uninstall: torch
Found existing installation: torch 2.0.1
Uninstalling torch-2.0.1:
Successfully uninstalled torch-2.0.1
Successfully installed torch-2.0.1+cu117

2023-08-09 14:25:24,471 INFO pip.py:187 -- Skip pip check.
2023-08-09 14:25:24,472 INFO utils.py:76 -- Run cmd[4] ['/tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52/virtualenv/bin/python', '-c', '\nimport ray\nwith open(r"/tmp/check_ray_version_tempfiler__nav46/ray_version.txt", "wt") as f:\n f.write(ray.__version__)\n f.write(" ")\n f.write(ray.__path__[0])\n ']
2023-08-09 14:25:25,128 INFO utils.py:99 -- No output for cmd[4]
2023-08-09 14:25:25,129 INFO pip.py:236 -- try to write ray version information in: /tmp/check_ray_version_tempfiler__nav46/ray_version.txt
2023-08-09 14:25:25,861 INFO uri_cache.py:84 -- Added URI pip://a920087bf3b915e51b6880562fe260b5332a3d52 with size 7539223515
2023-08-09 14:25:25,862 INFO pip.py:293 -- Cloning virtualenv /home/jackson/Next_Word_Autocomplete/autocomplete_ray to /tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52/virtualenv
2023-08-09 14:25:25,862 INFO utils.py:76 -- Run cmd[5] ['/home/jackson/Next_Word_Autocomplete/autocomplete_ray/bin/python', '/home/jackson/Next_Word_Autocomplete/autocomplete_ray/lib/python3.8/site-packages/ray/_private/runtime_env/_clonevirtualenv.py', '/home/jackson/Next_Word_Autocomplete/autocomplete_ray', '/tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52/virtualenv']
2023-08-09 14:25:25,947 INFO utils.py:97 -- Output of cmd[5]: Usage: _clonevirtualenv.py [options] /path/to/existing/venv /path/to/cloned/venv

_clonevirtualenv.py: error: dest dir '/tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52/virtualenv' exists

2023-08-09 14:25:25,948 INFO pip.py:416 -- Delete incomplete virtualenv: /tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52
2023-08-09 14:25:28,351 ERROR pip.py:418 -- Failed to install pip packages.
Traceback (most recent call last):
  File "/home/jackson/Next_Word_Autocomplete/autocomplete_ray/lib/python3.8/site-packages/ray/_private/runtime_env/pip.py", line 388, in _run
    await self._create_or_get_virtualenv(path, exec_cwd, logger)
  File "/home/jackson/Next_Word_Autocomplete/autocomplete_ray/lib/python3.8/site-packages/ray/_private/runtime_env/pip.py", line 330, in _create_or_get_virtualenv
    await check_output_cmd(create_venv_cmd, logger=logger, cwd=cwd, env=env)
  File "/home/jackson/Next_Word_Autocomplete/autocomplete_ray/lib/python3.8/site-packages/ray/_private/runtime_env/utils.py", line 101, in check_output_cmd
    raise SubprocessCalledProcessError(
ray._private.runtime_env.utils.SubprocessCalledProcessError: Run cmd[5] failed with the following details.
Command '['/home/jackson/Next_Word_Autocomplete/autocomplete_ray/bin/python', '/home/jackson/Next_Word_Autocomplete/autocomplete_ray/lib/python3.8/site-packages/ray/_private/runtime_env/_clonevirtualenv.py', '/home/jackson/Next_Word_Autocomplete/autocomplete_ray', '/tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52/virtualenv']' returned non-zero exit status 2.
Last 50 lines of stdout:
Usage: _clonevirtualenv.py [options] /path/to/existing/venv /path/to/cloned/venv

_clonevirtualenv.py: error: dest dir '/tmp/ray/session_2023-08-09_13-53-16_429892_3649853/runtime_resources/pip/a920087bf3b915e51b6880562fe260b5332a3d52/virtualenv' exists

Thanks for the details! This looks like it might be a bug in Ray. Would you mind posting this as an issue on the Ray GitHub? I can pull in the appropriate people. It would also be helpful if you could say how often this occurs and give details about your setup.

As a workaround, you can try using Ray Job Submission instead of Ray Client. Let us know if the same error occurs.
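A rough sketch of what that could look like with the Python SDK is below. It assumes the Jobs API is served by the dashboard on the head node at port 8265 (the usual default, but check your cluster), and the entrypoint `python main.py` in the usage comment is a placeholder for your own script. The helper names are made up for illustration. Whether this avoids the error depends on the underlying cause.

```python
from urllib.parse import urlsplit


def dashboard_url_from_client_address(client_address, dashboard_port=8265):
    """Derive an HTTP dashboard URL from a ray:// client address.

    Assumption: the dashboard runs on the same host as the Ray Client server.
    """
    host = urlsplit(client_address).hostname
    if host is None:
        raise ValueError(f"could not parse a host from {client_address!r}")
    return f"http://{host}:{dashboard_port}"


def submit_job(dashboard_url, entrypoint, runtime_env):
    """Submit a job through the Ray Jobs API and return its submission ID."""
    # Imported lazily so the URL helper above works without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(dashboard_url)
    return client.submit_job(entrypoint=entrypoint, runtime_env=runtime_env)


# Usage (placeholder entrypoint; needs a running cluster):
# url = dashboard_url_from_client_address("ray://192.168.1.76:10001")
# job_id = submit_job(url, "python main.py", runtime_env)
```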

@architkulkarni Thanks for following up. @Jackson Once you file the bug report, we will triage it. Please post the link to the issue here so that we can close this thread and track the problem on GitHub.

thanks!

Hey @Jules_Damji @architkulkarni, thanks for the follow-up. I have already submitted an issue about this here. For your information, I also tried the ray job submit command, but unfortunately I encountered the same error again.