No module named ... in ray cluster

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity
  • Low: It annoys or frustrates me for a moment.
  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.
  • High: It blocks me to complete my task.

Hi, I have built an AI service using a Ray cluster.

Now I have run into a problem.
It works well when I use the Ray cluster directly (ray start & ray.init),
but an error occurs when I use the Ray cluster from within an Airflow DAG (same project, same directory structure).

So I found some information about runtime_env in Ray’s docs, but it did not work.

Below is the error message from Airflow.

[Note: the word ‘actors’ that you see is just a folder name.]

[2022-11-17, 15:42:37 KST] {newsGathering.py:111} INFO - status: 400   errormsg: rootairflowdagsmeerkatutils>processhandler.py  status:4042 --> The actor died because of an error raised in its creation task, ray::Crawling.__init__() (pid=2081179, ip=172.17.0.8, repr=<actors.flows.collect.crawling.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7f393a8608b0>)
RuntimeError: The actor with name Crawling failed to import on the worker. This may be because needed library dependencies are not installed in the worker environment:

ray::Crawling.__init__() (pid=2081179, ip=172.17.0.8, repr=<actors.flows.collect.crawling.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7f393a8608b0>)
ModuleNotFoundError: No module named 'actors'
Traceback (most recent call last):
  File "/root/airflow/dags/meerkat/utils/processhandler.py", line 76, in parellelRun
    rtn = ray.get(funcArr)
  File "/usr/local/lib/python3.8/dist-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 1765, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::Crawling.__init__() (pid=2081179, ip=172.17.0.8, repr=<actors.flows.collect.crawling.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7f393a8608b0>)
RuntimeError: The actor with name Crawling failed to import on the worker. This may be because needed library dependencies are not installed in the worker environment:

ray::Crawling.__init__() (pid=2081179, ip=172.17.0.8, repr=<actors.flows.collect.crawling.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7f393a8608b0>)
ModuleNotFoundError: No module named 'actors'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/airflow/dags/meerkat/tasks/newsGathering.py", line 99, in flow
    parellelRun(funcs)
  File "/root/airflow/dags/meerkat/utils/processhandler.py", line 82, in parellelRun
    raise CommonException(str(e), Status.RAY_PARELLEL_RUN_FAIL.value, getPath(__file__))
base_structures.exceptions.common.CommonException: rootairflowdagsmeerkatutils>processhandler.py  status:4042 --> The actor died because of an error raised in its creation task, ray::Crawling.__init__() (pid=2081179, ip=172.17.0.8, repr=<actors.flows.collect.crawling.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7f393a8608b0>)
RuntimeError: The actor with name Crawling failed to import on the worker. This may be because needed library dependencies are not installed in the worker environment:

ray::Crawling.__init__() (pid=2081179, ip=172.17.0.8, repr=<actors.flows.collect.crawling.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7f393a8608b0>)
ModuleNotFoundError: No module named 'actors'

Please help me.

@YoungJoon88 I still think it’s a runtime env issue. How do you use the runtime env?

Do you mind running some remote task just to check the working dir to make sure the actors folder is there?
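
A check along these lines might look like the following sketch. It assumes the cluster is already running and reachable with address="auto". The helper names describe_worker_env and check_on_cluster are hypothetical, and the module name "actors" is taken from the traceback above.

```python
import importlib.util
import os
import sys


def describe_worker_env(module_name="actors"):
    """Report where this process runs and whether `module_name` can be found."""
    cwd = os.getcwd()
    return {
        "cwd": cwd,
        "cwd_entries": sorted(os.listdir(cwd)),
        "sys_path": list(sys.path),
        # find_spec returns None when a top-level module is not on sys.path
        "importable": importlib.util.find_spec(module_name) is not None,
    }


def check_on_cluster(working_dir, module_name="actors", address="auto"):
    """Run describe_worker_env as a Ray task and return what the worker sees.

    Assumption: `address` and `working_dir` match your own setup.
    """
    import ray  # imported lazily so the helper above works without Ray

    ray.init(address=address, runtime_env={"working_dir": working_dir})
    return ray.get(ray.remote(describe_worker_env).remote(module_name))
```

Calling check_on_cluster("/root/airflow/dags/meerkat") once from the plain script and once from the Airflow task, then comparing "cwd_entries" and "importable", should show whether the worker actually sees the actors folder in each case.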

Hi yic, thank you for your answer.

I set the working_dir several times, as shown below.
The ‘actors’ folder exists in the meerkat folder.

import ray

host = "127.0.0.1"  # e.g.
runtime_envs = {"working_dir": "/root/airflow/dags/meerkat"}  # 1st attempt
runtime_envs = {"working_dir": "/root/airflow/dags/meerkat/."}  # 2nd attempt
ray.init(address=host, runtime_env=runtime_envs)

In addition, below are the Python path entries on my server for two cases.
<Case 1. Run my own shell script and ray.init without Airflow: works well.>
‘/usr/local/lib/python3.8/dist-packages/ray/thirdparty_files’
‘/usr/local/lib/python3.8/dist-packages/ray/pickle5_files’
‘/usr/local/lib/python3.8/dist-packages/ray/workers’
‘/root/airflow/dags/meerkat/libs’
‘/usr/lib/python38.zip’
‘/usr/lib/python3.8’
‘/usr/lib/python3.8/lib-dynload’
‘/usr/local/lib/python3.8/dist-packages’
‘/usr/lib/python3/dist-packages’

P.S. My own shell script:
ray start --head --node-ip-address host --object-store-memory 1234567890

<Case 2. Run BashOperator and PythonOperator with Airflow: not working.>
‘/usr/local/lib/python3.8/dist-packages/ray/thirdparty_files’
‘/usr/local/lib/python3.8/dist-packages/ray/pickle5_files’
‘/usr/local/bin’,
‘/usr/lib/python38.zip’
‘/usr/lib/python3.8’
‘/usr/lib/python3.8/lib-dynload’
‘/usr/local/lib/python3.8/dist-packages’
‘/usr/lib/python3/dist-packages’
‘/root/airflow/dags’
‘/root/airflow/config’
‘/root/airflow/plugins’
‘/root/airflow/dags/meerkat’

P.S. The BashOperator executes the same shell script mentioned above.
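
To make the difference between the two cases easier to see, the two lists above can be compared entry by entry. This is only a sketch: the lists are copied from this post, and the helper name only_in is hypothetical.

```python
# Path entries copied from Case 1 (shell script, works)
case_1 = [
    "/usr/local/lib/python3.8/dist-packages/ray/thirdparty_files",
    "/usr/local/lib/python3.8/dist-packages/ray/pickle5_files",
    "/usr/local/lib/python3.8/dist-packages/ray/workers",
    "/root/airflow/dags/meerkat/libs",
    "/usr/lib/python38.zip",
    "/usr/lib/python3.8",
    "/usr/lib/python3.8/lib-dynload",
    "/usr/local/lib/python3.8/dist-packages",
    "/usr/lib/python3/dist-packages",
]

# Path entries copied from Case 2 (Airflow, fails)
case_2 = [
    "/usr/local/lib/python3.8/dist-packages/ray/thirdparty_files",
    "/usr/local/lib/python3.8/dist-packages/ray/pickle5_files",
    "/usr/local/bin",
    "/usr/lib/python38.zip",
    "/usr/lib/python3.8",
    "/usr/lib/python3.8/lib-dynload",
    "/usr/local/lib/python3.8/dist-packages",
    "/usr/lib/python3/dist-packages",
    "/root/airflow/dags",
    "/root/airflow/config",
    "/root/airflow/plugins",
    "/root/airflow/dags/meerkat",
]


def only_in(first, second):
    """Return entries of `first` that do not appear in `second`, keeping order."""
    seen = set(second)
    return [path for path in first if path not in seen]


only_in_case_1 = only_in(case_1, case_2)
only_in_case_2 = only_in(case_2, case_1)

print("Only in Case 1:", only_in_case_1)
print("Only in Case 2:", only_in_case_2)
```

Case 1 alone contains the ray/workers and meerkat/libs entries, while Case 2 alone contains /usr/local/bin and the Airflow directories, including /root/airflow/dags/meerkat.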

Hi,
first, thank you for your answer.

I left a message about the runtime_env I set,
but a spam filter blocked my reply.

Anyway, here is the runtime_env information again.

I set the runtime_env as follows.
First, I execute ray start with Airflow’s BashOperator:

ray start --head --node-ip-address host1 --dashboard-host host2 --port port1 --dashboard-port port2 --object-store-memory 1234567890

Second, I run ray.init with a PythonOperator:

ray.init(address=host1, runtime_env={"working_dir": "/root/airflow/dags/meerkat/"})

The actors folder is located under /root/airflow/dags/meerkat/.

In addition, here is the PYTHONPATH:
‘/root/airflow/dags/meerkat’
‘/usr/local/lib/python3.8/dist-packages/ray/thirdparty_files’,
‘/usr/local/lib/python3.8/dist-packages/ray/pickle5_files’,
‘/usr/local/bin’,
‘/usr/lib/python38.zip’,
‘/usr/lib/python3.8’,
‘/usr/lib/python3.8/lib-dynload’,
‘/usr/local/lib/python3.8/dist-packages’,
‘/usr/lib/python3/dist-packages’,
‘/root/airflow/dags’,
‘/root/airflow/config’,
‘/root/airflow/plugins’,

Thank you :)

Are they on the same node? Sorry, I’m not very familiar with Airflow.

From your list of folders, there is no actors folder?

To make it work, it has to be in your working_dir or you can use py_modules (Environment Dependencies — Ray 2.1.0)

Basically, you need a way to tell Ray to sync your files to the remote nodes.
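
As a rough sketch of the py_modules option, the runtime_env could be built like this. The helper names build_runtime_env and connect are hypothetical, the project path is the one from this thread, and whether this resolves the import error in the Airflow setup is not confirmed here.

```python
import os


def build_runtime_env(project_dir, package_names=("actors",)):
    """Build a runtime_env dict that ships the named local packages to workers.

    Each py_modules entry is a path to a local package directory, so the
    package becomes importable by its own name (e.g. `import actors`).
    """
    return {
        "py_modules": [os.path.join(project_dir, name) for name in package_names],
    }


def connect(address, project_dir="/root/airflow/dags/meerkat"):
    """Connect to an existing cluster with the runtime_env built above.

    Assumption: `address` points at the cluster started with `ray start`.
    """
    import ray  # imported lazily so build_runtime_env works without Ray

    return ray.init(address=address, runtime_env=build_runtime_env(project_dir))
```

With this, connect(host1) would take the place of the ray.init call in the PythonOperator. The existing working_dir setting can be kept alongside py_modules if other files from the project directory are needed on the workers.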

Yes, they are on the same node, and the actors folder is there.

I’m looking into Environment Dependencies.

Thank you for your help.