Accessing Ray cluster in AWS

I created a Ray cluster on AWS using this script. I can submit tasks to the cluster when I am logged in to the head node. However, when I call ray.init() from a Jupyter notebook in SageMaker, it times out.

The notebook instance can reach the cluster node over the network: when I intentionally provided a bad Redis password, I got a password mismatch error.

Can someone please help me debug this issue?


Hey, ray.init() assumes that it is being called from within the same VPC as the cluster, so it needs essentially all ports on all workers to be accessible. Opening all of them is naturally not a great security practice.
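To see which ports are actually reachable from the notebook instance, a quick TCP probe can help. This is only a rough sketch using the Python standard library. The helper names, the host 10.0.0.12, and the port list are placeholders I made up for illustration; substitute the address and ports of your own cluster.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_ports(host, ports, timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Hypothetical usage from the notebook instance (host and ports are assumptions):
# probe_ports("10.0.0.12", [6379, 10001])
```

A port that shows up as closed here is blocked by a security group or firewall, or simply has nothing listening on it. The probe cannot tell these cases apart.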

I’d recommend trying Ray Client. It’s still a beta feature, but it’s built for this exact purpose – running a ray driver from outside the cluster.
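Roughly, connecting through Ray Client looks like the sketch below. It assumes a Ray build that provides ray.util.connect and ray.util.disconnect, and that the Ray Client server on the head node listens on port 10001. The helper names and the host value are placeholders, not something from your setup.

```python
def client_address(head_host, port=10001):
    """Build the "host:port" string that Ray Client connects to."""
    return f"{head_host}:{port}"


def run_remote_square(head_host, port=10001):
    """Connect through Ray Client, run one remote task, then disconnect."""
    # Imported lazily so client_address() works on machines without Ray.
    import ray
    import ray.util

    ray.util.connect(client_address(head_host, port))

    @ray.remote
    def square(x):
        return x * x

    try:
        return ray.get(square.remote(4))
    finally:
        ray.util.disconnect()


# Hypothetical usage (the host is an assumption):
# run_remote_square("10.0.0.12")
```

With this approach only the Ray Client port has to be reachable from the notebook, rather than every port on every worker.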

I am getting: AttributeError: module 'ray.util' has no attribute 'connect'.
The Ray version on my notebook instance is ray-1.2.0.dev0.

I installed it with the following commands:
pip install -U ray
ray install-nightly

Ah we recently broke that command when we did a version bump. Can you grab the nightly from here? Installing Ray — Ray v2.0.0.dev0

Your ray version should be 2.0.0.dev0. After this, ray install-nightly should work again.
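To confirm that the installed build has the client API, something like the following should do. It is a small sketch; the helper names are made up here, and it only checks for the ray.util.connect attribute that the error above complained about.

```python
def supports_ray_client(ray_module):
    """Return True if the given ray module exposes ray.util.connect."""
    util = getattr(ray_module, "util", None)
    return util is not None and hasattr(util, "connect")


def describe(ray_module):
    """Summarize the version and whether ray.util.connect is present."""
    version = getattr(ray_module, "__version__", "unknown")
    status = "available" if supports_ray_client(ray_module) else "missing"
    return f"ray {version}: ray.util.connect {status}"


# Usage on the notebook instance, once Ray is installed:
# import ray
# print(describe(ray))
```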

Hi Alex,
Thanks for the pointer. I was able to connect to the Ray cluster from a Jupyter notebook running on a different host. Remote calls went through fine.

Next, I tried to load a data frame with Modin:

import modin.pandas as pd
import numpy as np

frame_data = np.random.randint(0, 100, size=(2**10, 2**8))
df = pd.DataFrame(frame_data)

But I am getting: “Exception: Trying to start two instances of ray via client”.

Modin is calling ray.init() internally. Is there a workaround for this?


This looks similar to the thread “Modin with ray client on k8s”.