Install modin or any pip package on ray cluster

  • I installed Ray on a Kubernetes cluster with Helm and managed to connect to it with ray.client.
  • I added
setup_commands:
    - pip install modin[ray]

to ray/python/ray/autoscaler/kubernetes/example-full.yaml and ran ray up.

  • I’m able to connect to the Ray cluster with ray.client, but when I run modin operations I get:
...
~/.pyenv/versions/3.7.7/envs/env_ray/lib/python3.7/site-packages/ray/util/client/worker.py in _call_schedule_for_task(self, task)
    339         if not ticket.valid:
    340             try:
--> 341                 raise cloudpickle.loads(ticket.error)
    342             except pickle.UnpicklingError:
    343                 logger.exception("Failed to deserialize {}".format(

ModuleNotFoundError: No module named 'modin'

Manually SSH-ing into the head and the workers and running “pip install modin[ray]” solves the issue, but of course that is not a good permanent solution. It just confirms that the problem is indeed that modin is missing on the cluster nodes.
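A quick way to confirm the diagnosis without SSH-ing into the nodes is to run the import check as a Ray task. This is only a minimal sketch: `module_available` and `check_on_cluster` are hypothetical helper names, and it assumes a connection to the cluster has already been established. Note that it only reports on whichever worker happens to pick up the task, not on every node.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


def check_on_cluster(name="modin"):
    """Run the same check as a Ray task.

    Assumes a Ray connection (ray.init or a client connection) already exists.
    """
    import ray  # imported lazily so module_available() works without Ray installed

    remote_check = ray.remote(module_available)
    # The result reflects the environment of the worker that executed the task.
    return ray.get(remote_check.remote(name))
```

If `check_on_cluster("modin")` returns False while `module_available("modin")` is True locally, the package is installed on the client machine but not in the environment the workers run in.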

What am I doing wrong? How do I do it right?
Thanks for your help! :slight_smile:

In case someone else has problems with this, I’ll try to answer it myself:

TLDR: Don’t use helm for ray on kubernetes

I still don’t know why this did not work, but I found that setting up Ray with Helm is definitely not the easiest way.
Using ray up with the default Ray-on-Kubernetes config from the Ray repo works at least as easily. Also, setup_commands do work then.
Good luck!

The solution is to either build the dependency into the image you’re using or try runtime environments (see “Advanced Usage” in the Ray v1.4.1 documentation).
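For the runtime-environment route, a rough sketch might look like the following. It assumes a recent Ray release in which `ray.init` accepts a `runtime_env` argument with a `pip` field; the entry point in Ray 1.4.x may differ, so check the documentation for your version. `build_runtime_env` and `connect` are hypothetical helper names, and the address is a placeholder.

```python
def build_runtime_env(pip_packages):
    """Build a runtime_env dict asking Ray to pip-install the given packages on the workers."""
    return {"pip": list(pip_packages)}


def connect(address, pip_packages):
    """Connect to a cluster with a pip-based runtime environment.

    Assumption: ray.init(address=..., runtime_env=...) is available,
    as in recent Ray releases.
    """
    import ray  # imported lazily; only needed when actually connecting

    return ray.init(address=address, runtime_env=build_runtime_env(pip_packages))


# Example call (placeholder address, not executed here):
# connect("ray://<head-node-host>:10001", ["modin[ray]"])
```

Compared with baking modin into the image, this keeps the cluster image generic at the cost of installing packages when the environment is first set up.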


Since I now use ray up with a config for Kubernetes instead of installing the cluster with Helm, setup_commands also works as expected.
Still, being able to define the runtime environment dynamically is a really nice additional option for flexible use!
Thanks for the hint Dmitri! :slight_smile:
