Looking for explanations of a couple of behaviors that do not appear to be fully covered in the available documentation/issues. I wasn't sure of the best way to ask, so I have included an example that exhibits them. Ideally, I would like to pass runtime_env arguments to ray.init and have a Pool use the processes already started in that cluster, rather than starting more (the pool workers do not need the runtime_env, though it would not be a problem if they had it). Thanks in advance for any thoughts!
When a worker_process_setup_hook (and/or env_vars) is specified in a ray.init call through runtime_env, extra processes are started when a multiprocessing Pool is subsequently created. This complicates resource management, since it allows more active processes than there are processors. Manageable…sure, annoying, absolutely! If runtime_env arguments are not provided to ray.init, the subsequent Pool call connects to and uses the processes already initialized in the cluster.
-
There does not appear to be a way to add runtime_env arguments to the pool startup, such that existing cluster processes would be used (as is done when runtime_env arguments are not specified/included during ray.init). Is there a way to do so?
-
env_vars can be specified under the runtime_env arguments of a ray.remote decorator. However, worker_process_setup_hook cannot; passing it there raises a TypeError (Object of type function is not JSON serializable). If there is a way to do so, it would be a suitable workaround: using runtime_env arguments per actor rather than globally through ray.init.
-
A less important side question: running ray.init with num_cpus=2 opens 3 additional Python processes. What is the 3rd process? A manager for the other 2? Subsequently starting a pool (with runtime_env having been specified in the ray.init call) doubles the number of processes from 4 to 8. I would appreciate some insight into what is going on here.
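On the TypeError mentioned above: the message is the one Python's standard json module produces for any function object, so my guess (an assumption from the error text, not something I have confirmed in the Ray source) is that the per-actor runtime_env gets JSON-encoded somewhere along that path. A minimal reproduction without Ray, where EXAMPLE_FLAG and try_json_encode are just made-up names for illustration:

```python
import json


def hook():
    """Stand-in for a worker_process_setup_hook; does nothing."""
    return None


def try_json_encode(runtime_env):
    """Return (True, encoded_text) if json can encode the dict, else (False, error_message)."""
    try:
        return True, json.dumps(runtime_env)
    except TypeError as exc:
        return False, str(exc)


# A dict of plain strings, like env_vars, encodes without trouble.
env_vars_result = try_json_encode({"env_vars": {"EXAMPLE_FLAG": "1"}})

# A dict holding a function object cannot be encoded by json.
hook_result = try_json_encode({"worker_process_setup_hook": hook})

print(env_vars_result)
print(hook_result)
```

If that guess is right, it would explain why env_vars work per actor while a callable hook does not; it says nothing about whether Ray offers another way to pass the hook there.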
Example
#Windows 11 system with 32 threads
#Python process count from task manager
import ray
from ray.util.multiprocessing import Pool
def hook():
    return None
#Running without a setup_hook
### 1 Python process running here
cxt = ray.init(num_cpus=2, include_dashboard=False)
### 4 Python processes running here; 1 original, 2
pool = Pool(ray_address='auto')
### 4 Python processes running here
#Reset
ray.shutdown()
#With setup hook
### 1 Python process running here
cxt = ray.init(num_cpus=2, include_dashboard=False, runtime_env={"worker_process_setup_hook": hook})
### 4 Python processes running here
pool = Pool(ray_address='auto')
### 8 Python processes running here