Failed to create runtime_env for Ray client server: [Errno 2] No such file or directory

Championzb · February 4, 2022, 7:22pm

Hi,

I followed the steps here to install Ray on Kubernetes using Helm.
The Python version is 3.7.7 and the Ray version is 1.9.2.

I port-forwarded port 10001 of the service to my local machine and ran the following script:

import ray
import requests

runtime_env = {"pip": ["requests", "ray[serve]"]}
ray.init(address="ray://127.0.0.1:10001", runtime_env=runtime_env)


@ray.remote
def reqs():
    return requests.get("https://www.ray.io/")


if __name__ == "__main__":
    print(ray.get(reqs.remote()))

I get this error:

ConnectionAbortedError: Initialization failure from server:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 624, in Datapath
    client_id, job_config):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 281, in start_specific_server
    specific_server=specific_server,
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 234, in _create_runtime_env
    "Failed to create runtime_env for Ray client "
RuntimeError: Failed to create runtime_env for Ray client server: [Errno 2] No such file or directory: '/tmp/ray/session_2022-02-04_08-56-23_262496_116/runtime_resources/conda/c390bf3cf7e61e5e2ee55126ce55ec8f1eb8e565'

The same script works against a local Ray server. Any ideas?

Thanks in advance!

ckw017 · February 10, 2022, 7:03pm

@Championzb can you try rerunning and then checking the contents of the /tmp/ray/session_***/runtime_resources directory on the head node (match session_*** to the path in the error you’re seeing)?
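One way to run that check from a Python shell on the head node is sketched below. The helper name `list_runtime_resources` is made up for illustration, and the default `/tmp/ray` root is taken from the path in the error message; adjust it if your cluster uses a different temp directory.

```python
from pathlib import Path


def list_runtime_resources(ray_tmp="/tmp/ray"):
    """Map each session directory name to the entries under its runtime_resources dir.

    A session whose runtime_resources directory is missing maps to None.
    (Hypothetical helper, not part of Ray.)
    """
    listing = {}
    for session in sorted(Path(ray_tmp).glob("session_*")):
        resources = session / "runtime_resources"
        if resources.is_dir():
            # Paths are reported relative to runtime_resources, e.g. "conda/<hash>".
            listing[session.name] = sorted(
                str(p.relative_to(resources)) for p in resources.rglob("*")
            )
        else:
            listing[session.name] = None
    return listing


if __name__ == "__main__":
    for session, entries in list_runtime_resources().items():
        print(session, entries)
```

If the session named in the error shows `None` or has no `conda` entry, the environment was never materialized on the head node, which matches the `[Errno 2]` in the traceback.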

cc @architkulkarni any guesses here?

architkulkarni · February 10, 2022, 8:03pm

@Championzb Sorry you’re running into this, I’m not sure what the problem could be off the top of my head… Is it possible to see if the issue persists on Ray 1.10.0?

Also, are there any relevant logs in /tmp/ray/session_***/logs on the head node? For example, dashboard_agent.log or ray_client_server logs?
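If it helps, here is a rough sketch for pulling the tail of those logs in one go. `tail_matching_logs` is a hypothetical helper, and the `ray_client_server*` glob is an assumption about how the client server log files are named; pass the logs directory of the session from your error.

```python
from pathlib import Path


def tail_matching_logs(
    logs_dir,
    patterns=("dashboard_agent.log", "ray_client_server*"),  # second pattern is a guess
    lines=20,
):
    """Return {file name: last `lines` lines} for log files matching the patterns."""
    tails = {}
    for pattern in patterns:
        for log in sorted(Path(logs_dir).glob(pattern)):
            if log.is_file():
                # errors="replace" keeps going if a log contains undecodable bytes.
                text = log.read_text(errors="replace")
                tails[log.name] = text.splitlines()[-lines:]
    return tails


if __name__ == "__main__":
    # Scan every session under /tmp/ray; narrow this to the session in the error.
    for logs_dir in sorted(Path("/tmp/ray").glob("session_*/logs")):
        for name, tail in tail_matching_logs(logs_dir).items():
            print(f"== {logs_dir}/{name}")
            print("\n".join(tail))
```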

Championzb · February 10, 2022, 8:10pm

Hey @ckw017 @architkulkarni, thank you for the response. I’ve found the issue. By default, the head node was deployed with 512Mi of memory. After increasing it to 2Gi, the issue went away.
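Since the original error gives no hint about memory, a client script could wrap the connection step and attach one. This is only a sketch: `connect_with_hint` is a hypothetical helper, not a Ray API, and the hint text simply restates what worked in this thread rather than a confirmed root cause.

```python
def connect_with_hint(connect):
    """Call `connect()` and add a memory hint if runtime_env creation fails.

    `connect` is any zero-argument callable, for example:
        lambda: ray.init(address="ray://127.0.0.1:10001", runtime_env=runtime_env)
    """
    try:
        return connect()
    except ConnectionAbortedError as exc:
        # The Ray client surfaced the failure above as ConnectionAbortedError.
        if "Failed to create runtime_env" in str(exc):
            raise RuntimeError(
                "runtime_env creation failed on the cluster; in one reported case "
                "the head node was limited to 512Mi of memory and raising the "
                "limit to 2Gi resolved it"
            ) from exc
        raise
```

Any other connection failure is re-raised unchanged, so the wrapper only affects the runtime_env case.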

architkulkarni · February 10, 2022, 10:02pm

That’s great! Any idea what exactly was running up against the memory limit? What gave you the idea to increase the memory? It could be helpful for us to know as we improve our error messages and failure handling.

Championzb · February 11, 2022, 12:46am

Hi @architkulkarni, no idea. I’m just running a very simple script. I happened to notice on the dashboard that memory usage increased when I ran the job.
