ModuleNotFoundError during deployment after upgrade to 1.9.0

blshao84 · December 7, 2021, 4:03am

Hi, I’m trying to upgrade my Ray cluster to the latest release, 1.9.0, but I hit the error below during Serve deployment.

@serve class:

@serve.deployment(num_replicas=1)
@serve.ingress(app=serve_app)
class PricingApi(object):

deploy script:

ray.init(address='localhost:6379',
            _redis_password='5241590000000000',
            log_to_driver=False,
            namespace='serve')
PricingApi.deploy()

I ran the deploy script above from the project root and got these errors:


2021-12-07 03:48:37,383	INFO worker.py:842 -- Connecting to existing Ray cluster at address: 10.1.1.14:6379
2021-12-07 03:48:37,652	INFO api.py:242 -- Updating deployment 'PricingApi'. component=serve deployment=PricingApi
{'object_store_memory': 1491309772.0, 'memory': 2982619547.0, 'CPU': 2.0, 'node:10.1.1.14': 1.0}
Traceback (most recent call last):
  File "deploy_ray_serve.py", line 10, in <module>
    deploy_cluster()
  File "deploy_ray_serve.py", line 7, in deploy_cluster
    PricingApi.deploy()
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/ray/serve/api.py", line 789, in deploy
    return _get_global_client().deploy(
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/ray/serve/api.py", line 93, in check
    return f(self, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/ray/serve/api.py", line 248, in deploy
    self._wait_for_goal(goal_id)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/ray/serve/api.py", line 184, in _wait_for_goal
    raise async_goal_exception
RuntimeError: Deployment 'PricingApi' failed, deleting it asynchronously.

Here’s the related output from Ray’s log:


2021-12-07 03:48:39,670	ERROR worker.py:431 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::SERVE_REPLICA::PricingApi#gxItCf:RayServeWrappedReplica.__init__ (pid=3668, ip=10.1.1.14)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/ray/serve/replica.py", line 48, in __init__
    deployment_def = cloudpickle.loads(serialized_deployment_def)
ModuleNotFoundError: No module named 'bct'
:actor_name:PricingApi

Any thoughts?

blshao84 · December 7, 2021, 5:39am

It turns out I have to run ‘serve start’ from my project root … there wasn’t such a ‘restriction’ in the past. Is this expected?

jiaodong · December 7, 2021, 7:07am

Hi @blshao84, thanks for trying out our latest release. I believe serve.start() / serve start has been in our e2e tutorial since the Ray 1.3 documentation: End-to-End Tutorial — Ray v2.0.0.dev0. It’s required to initialize the Serve APIs. Maybe you previously had it working with serve.start(detached=True), or ran the CLI command serve start so that Serve kept running in the background, and your recent upgrade required a restart, so it needs to be executed again?

blshao84 · December 7, 2021, 8:41am

Hi @jiaodong, I hit the ‘ModuleNotFound’ issue above in my regression test, which launches Ray and Serve as follows:

ray start --head
serve start
python deploy.py

In deploy.py:

ray.init(address='localhost:6379', namespace='serve')
PricingApi.deploy()

Before 1.9.0, my ‘serve start’ and ‘python deploy.py’ were not executed from the same directory and it worked fine. But with 1.9.0, I have to run both ‘serve start’ and ‘python deploy.py’ from my project root. I’m totally OK with that (and I think it’s better to do it this way), but I’m curious whether a specific change in this release forced it.

jiaodong · December 8, 2021, 6:16pm

That’s interesting … serve start is simply a wrapper that calls serve.start() from the Serve API, and it shouldn’t matter where you execute it. How did you install and import the module bct in your code, and do you see the same symptom without using it?
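One way to answer the "how is bct imported" question is to check, in each process involved, whether the module is resolvable and from where. The sketch below uses only the standard library; the helper name describe_module is hypothetical and not part of Ray.

```python
import importlib.util
import os
import sys


def describe_module(name: str) -> dict:
    """Report whether a top-level module is importable from this process."""
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(name)
    return {
        "cwd": os.getcwd(),                      # directory the process runs in
        "found": spec is not None,               # True if the import would resolve
        "origin": getattr(spec, "origin", None), # file the module would load from
        "sys_path_head": sys.path[:3],           # first search-path entries
    }


if __name__ == "__main__":
    # 'bct' is the module name from this thread; substitute your own package.
    print(describe_module("bct"))
```

Running it once from the directory where serve start was launched and once from the project root should show whether the two environments see the module differently.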

blshao84 · December 11, 2021, 9:01am

‘bct’ is not a 3rd party lib but my own source code. Here’s my project structure:

project_root
      bct
      examples
      tests
      deploy.py
      ...

It only works if I run both serve start and python deploy.py from ‘project_root’.
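The traceback above fails inside cloudpickle.loads, which is consistent with the class being serialized by reference: the pickle stores only the module path, so the process that loads it must be able to import that module. A minimal standard-library sketch of that effect follows, with pickle standing in for cloudpickle and a throwaway package named bct_demo (a hypothetical name) standing in for bct. It does not involve Ray at all.

```python
import importlib
import os
import pickle
import subprocess
import sys
import tempfile


def try_unpickle(payload: bytes, cwd: str) -> str:
    """Unpickle `payload` in a fresh interpreter started from `cwd`."""
    # Drop variables that would change how the child builds sys.path.
    env = {k: v for k, v in os.environ.items()
           if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    child = "import pickle, sys; pickle.loads(sys.stdin.buffer.read())"
    result = subprocess.run(
        [sys.executable, "-c", child],
        input=payload, cwd=cwd, env=env, capture_output=True,
    )
    if result.returncode == 0:
        return "ok"
    # The last stderr line carries the exception type and message.
    return result.stderr.decode().strip().splitlines()[-1]


def demo() -> dict:
    with tempfile.TemporaryDirectory() as project_root, \
         tempfile.TemporaryDirectory() as elsewhere:
        # Build project_root/bct_demo/__init__.py containing one class.
        pkg = os.path.join(project_root, "bct_demo")
        os.makedirs(pkg)
        with open(os.path.join(pkg, "__init__.py"), "w") as f:
            f.write("class PricingApi:\n    pass\n")

        # Import the package locally and pickle the class (by reference).
        sys.path.insert(0, project_root)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("bct_demo")
            payload = pickle.dumps(module.PricingApi)
        finally:
            sys.path.remove(project_root)
            sys.modules.pop("bct_demo", None)

        # `python -c` puts the working directory on sys.path, so only the
        # child started from project_root can resolve the module.
        return {
            "from_project_root": try_unpickle(payload, project_root),
            "from_elsewhere": try_unpickle(payload, elsewhere),
        }


if __name__ == "__main__":
    print(demo())
```

The child started outside the project root fails with a ModuleNotFoundError of the same kind as the replica log, while the one started from the project root succeeds.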

blshao84 · December 13, 2021, 10:14am

hi @jiaodong, I got a repro here: GitHub - blshao84/ray-deploy-repro: reproduce ray's serve deployment issue

jiaodong · December 15, 2021, 12:00am

Thanks @blshao84 for the context, this is very comprehensive. I created [Bug] ModuleNotFoundError during deployment after upgrade to 1.9.0 · Issue #21095 · ray-project/ray · GitHub to keep track of the issue and will look into it soon.

shrekris · December 22, 2021, 12:52am

@blshao84 Thanks for your patience! We’re still looking into the issue. We have a company holiday until Jan 3, so we’ll be able to take a closer look afterwards. Happy holidays!

blshao84 · January 5, 2022, 1:21am

Any update on this? I just found another, similar situation where this issue occurs: GitHub - blshao84/ray-deploy-repro at tests_dir

simon-mo · January 13, 2022, 5:49pm

Hi @blshao84, thank you for your reproduction. Ray cannot magically move things and find Python modules without some hints. Please take a look at Handling Dependencies — Ray v1.9.2 and let me know whether it helps!
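As a rough illustration of what such a hint might look like, the sketch below passes a runtime_env with a working_dir to ray.init, reusing the address and namespace from the deploy script earlier in the thread. It assumes the linked page's runtime environment feature is what is being suggested. The helper names build_runtime_env and connect are hypothetical, and whether a job-level runtime_env reaches replicas of a Serve instance that was started separately with serve start is not confirmed in this thread.

```python
import os


def build_runtime_env(project_root: str) -> dict:
    """Return a runtime_env dict that points Ray at the project directory."""
    # working_dir is uploaded to the cluster and used as the workers' cwd,
    # so packages under it (such as 'bct' in this thread) become importable.
    return {"working_dir": os.path.abspath(project_root)}


def connect(project_root: str, address: str = "localhost:6379"):
    """Connect to a running cluster with the project shipped as working_dir."""
    # Imported lazily so this file can be read without Ray installed.
    import ray

    return ray.init(
        address=address,
        namespace="serve",
        runtime_env=build_runtime_env(project_root),
    )


if __name__ == "__main__":
    # Assumes a cluster started with `ray start --head` is reachable.
    connect(".")
```

Setting PYTHONPATH to the project root before starting Ray and Serve is another commonly used hint, with the same caveat that its effect on this specific regression is not established here.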
