Ray.init() hanging with conda (pip) installation

nrclaudio · April 11, 2022, 2:18pm

High: It blocks me from completing my task.

Hi all,

I’m trying to set up Ray in my workflow. I started with a clean install using conda. However, I’m facing exactly the problem described in this GitHub issue: "Ray workers unable to register when used with "venv"-created virtual environment on Windows with Python 3.7.3+" (Issue #13794, ray-project/ray).

What I’ve done:

conda create -n ray python=3.7
conda activate ray
pip install ray[tune]

Then, within python:

import ray
ray.init()

This hangs and gives no output. If I look at the logs (e.g. raylet.err), I get the exact same error mentioned in the above GitHub issue:

(raylet) worker_pool.cc:481: Some workers of the worker process(2841883) have not registered within the timeout. The process is still alive, probably it's hanging during start.

However, if I run:

import ray
ray.init(num_cpus=1)

Then ray initializes correctly.

I’m using a shared HPC infrastructure running on CentOS Stream 8.

Versions:

Ray 1.11.0
Python 3.7.8
CentOS Stream 8

Clark_Zinzow · April 20, 2022, 2:48am

I think that Ray will start a worker process per detected CPU core; if you’re using shared HPC infra, could Ray possibly be trying to start a ton of worker processes?
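If that is what is happening, one way to check is to compare the number of cores the node reports with the number your job is actually allowed to use, and pass the smaller value to ray.init(num_cpus=...). The following is only a rough sketch: the helper names usable_cpu_count and start_ray are made up for illustration, and reading SLURM_CPUS_PER_TASK assumes the cluster uses Slurm, which the original post does not say.

```python
import os


def usable_cpu_count(env=os.environ):
    """Best-effort guess at how many CPUs this process may actually use.

    Hypothetical helper, not part of Ray.
    """
    # Cores reported by the machine; on a shared HPC node this can be large.
    candidates = [os.cpu_count() or 1]

    # On Linux, the affinity mask reflects any CPU pinning applied to the job.
    if hasattr(os, "sched_getaffinity"):
        candidates.append(len(os.sched_getaffinity(0)))

    # Assumption: the scheduler is Slurm and exports this variable.
    slurm = env.get("SLURM_CPUS_PER_TASK", "")
    if slurm.isdigit() and int(slurm) > 0:
        candidates.append(int(slurm))

    return max(1, min(candidates))


def start_ray():
    """Start Ray with an explicit CPU cap instead of the detected core count."""
    import ray  # imported lazily so the helper above works without Ray installed

    n = usable_cpu_count()
    print(f"machine reports {os.cpu_count()} cores, starting Ray with num_cpus={n}")
    return ray.init(num_cpus=n)
```

Calling start_ray() in place of a bare ray.init() keeps the number of worker processes in line with the job's allocation. If the reported core count is much larger than the capped value, that would be consistent with the hang going away when num_cpus=1 is passed.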
