How to invoke conda env in subprocess during a ray job?

fredchang · January 6, 2024, 5:03pm

How severe does this issue affect your experience of using Ray?

High: It blocks me to complete my task.

The goal is to have Python invoke, via subprocess, a command-line program that is installed with conda.
The subprocess therefore has to run inside the conda environment that provides this program.

Given that ray mangles the conda environment name in runtime_env definitions, what is the best approach to support this use-case?

So far, I see two paths:

1. (hacky) Search the conda environments on the Ray worker for the command-line program, then invoke subprocess against whichever environment contains it.
2. (better?) Replace the conda install of the command-line program with an apt-get package, installed at the OS level in the setup step of the cluster.yaml definition.
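
For reference, the first ("hacky") path could be sketched roughly as below. This is only an illustration under assumptions: `find_program_in_conda_envs` and the `envs_root` argument are hypothetical names, and where the conda environments actually live on a Ray worker depends on the setup, so the directory is passed in rather than guessed.

```python
import os
from pathlib import Path
from typing import Optional


def find_program_in_conda_envs(program: str, envs_root: Path) -> Optional[Path]:
    """Return the first <envs_root>/<env>/bin/<program> that is executable.

    envs_root is an assumption: the directory that holds one sub-directory
    per conda environment on the worker.
    """
    if not envs_root.is_dir():
        return None
    # Sort so the result is deterministic when several envs ship the program.
    for env_dir in sorted(envs_root.iterdir()):
        candidate = env_dir / "bin" / program
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate
    return None
```

The returned path could then be handed to `subprocess.run`. Note that this picks the first match only, which may not be the environment Ray activated for the current task.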

Any other thoughts are appreciated.

Stephanie_Wang · January 8, 2024, 7:16pm

Hmm not sure that I understand the issue exactly. Can you not invoke the subprocess with the full path of the desired conda env? E.g., /home/<user>/anaconda3/envs/<env>/bin/...?
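
A minimal sketch of that full-path idea, assuming the worker process has `CONDA_PREFIX` set (conda sets it on activation, but whether it is present in your worker is something to verify). The helper names `env_program_path` and `run_in_env` are made up for illustration.

```python
import os
import subprocess
from pathlib import Path
from typing import Optional, Sequence


def env_program_path(program: str, prefix: str) -> Path:
    """Build <prefix>/bin/<program> for a conda env located at prefix."""
    return Path(prefix) / "bin" / program


def run_in_env(
    program: str, args: Sequence[str], prefix: Optional[str] = None
) -> subprocess.CompletedProcess:
    """Run a program by its full path inside a conda env.

    If prefix is not given, fall back to CONDA_PREFIX (assumption: the
    current process runs inside an activated conda env).
    """
    prefix = prefix or os.environ["CONDA_PREFIX"]
    exe = env_program_path(program, prefix)
    return subprocess.run([str(exe), *args], capture_output=True, text=True)
```

Calling the binary directly skips conda's activation scripts, so a program that relies on environment variables set during activation may need `conda run` instead.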

fredchang · January 9, 2024, 4:48pm

Hi Stephanie,

You are correct.

So to clarify, my original question was a general one: how to invoke command-line programs via subprocess from Python, inside a Ray process that was set up with a conda environment. The issue is that the user does not know a priori the name of the conda environment that Ray sets up for you.

The solution:

- You need a handle to the conda environment that Ray activated on behalf of your defined environment.
- Since Ray does not give this environment a name, what you get is the path to the environment.
- You then execute `conda run -p <path>`, where the -p flag selects the environment by path.

Below is a self-contained example that you can submit to your Ray cluster:

```
ray job submit --runtime-env my_ray_env.yml --address http://localhost:8265 --working-dir . -- python test_subprocess_and_conda.py
```

```yaml
# my_ray_env.yml
conda:
  channels:
    - conda-forge
  dependencies:
    - openbabel

# The runtime_env field name uses an underscore.
working_dir: "."
```

```python
# test_subprocess_and_conda.py
import os
import subprocess

import ray
from ray.runtime_env import RuntimeEnv

runtime_env = RuntimeEnv(
    conda={"channels": ["conda-forge"], "dependencies": ["openbabel"]}
)

ray.init()


@ray.remote(runtime_env=runtime_env)
def f(x):
    # Get the conda environment Ray set up and activated for you
    # (NB: this only applies if conda was used to set up the env in Ray).
    active_conda_env = os.environ["CONDA_DEFAULT_ENV"]
    # Ray did not define a conda env name, so this value is the env's path;
    # use the -p flag to invoke the environment by path.
    output = subprocess.run(
        ["conda", "run", "-p", active_conda_env, "obabel", "-h"],
        capture_output=True,
        text=True,
    )
    print(output.stdout)
    # Also return the output so ray.get() on the driver shows it.
    return output.stdout


futures = [f.remote(i) for i in range(4)]
print(ray.get(futures))
```

Stephanie_Wang · January 9, 2024, 8:52pm

Hmm sorry I think I’m still misunderstanding – does the script you provided work as intended or is there still an issue?

fredchang · January 10, 2024, 12:55am

It is the working script, for those who want to run subprocess commands from Python inside Ray with conda.

fredchang · January 10, 2024, 2:42am

To further clarify, there is no more issue.
Thanks, Stephanie.
