Ray Client + Dask on Ray?

Hello!

I am trying to use Dask on Ray, and Ray Client.

I am just wondering if this integration is supposed to work.

My code looks something like…


import ray
ray.util.connect("10.0.0.211:10001")  # replace with the appropriate host and port

import dask
from ray.util.dask import ray_dask_get

dask.config.set(scheduler=ray_dask_get)

func = lambda x, y: x + y
x = da.arange(8).reshape(2, 4).rechunk((1, 2))
y = da.arange(4).rechunk(2)
da.map_overlap(func, x, y, depth=1).compute(scheduler=ray_dask_get)


Above compute call fails with something like …


(pid=15372, ip=10.0.220.22) 2021-04-20 00:50:27,696	ERROR worker.py:77 -- Unhandled error (suppress with RAY_IGNORE_UNHANDLED_ERRORS=1): ray::dask:('rechunk-split-rechunk-merge-b732b7c02930dfce2158436c45af0891', 0) (pid=15372, ip=10.0.220.22)
(pid=15372, ip=10.0.220.22)   File "python/ray/_raylet.pyx", line 501, in ray._raylet.execute_task
(pid=15372, ip=10.0.220.22)   File "/home/ec2-user/anaconda3/envs/miami_ml_base/lib/python3.7/site-packages/ray/util/dask/scheduler.py", line 346, in dask_task_wrapper
(pid=15372, ip=10.0.220.22) TypeError: 'ray._raylet.ObjectRef' object is not subscriptable

....


(pid=15372, ip=10.0.220.22)   File "python/ray/_raylet.pyx", line 501, in ray._raylet.execute_task
(pid=15372, ip=10.0.220.22)   File "/home/ec2-user/anaconda3/envs/miami_ml_base/lib/python3.7/site-packages/ray/util/dask/scheduler.py", line 346, in dask_task_wrapper
(pid=15372, ip=10.0.220.22)   File "/opt/amazon/lib/python3.7/site-packages/dask/array/core.py", line 4409, in concatenate3
(pid=15372, ip=10.0.220.22)     chunks = chunks_from_arrays(arrays)
(pid=15372, ip=10.0.220.22)   File "/opt/amazon/lib/python3.7/site-packages/dask/array/core.py", line 4180, in chunks_from_arrays
(pid=15372, ip=10.0.220.22)     result.append(tuple([shape(deepfirst(a))[dim] for a in arrays]))
(pid=15372, ip=10.0.220.22)   File "/opt/amazon/lib/python3.7/site-packages/dask/array/core.py", line 4180, in <listcomp>
(pid=15372, ip=10.0.220.22)     result.append(tuple([shape(deepfirst(a))[dim] for a in arrays]))
(pid=15372, ip=10.0.220.22) IndexError: tuple index out of range

cc @Clark_Zinzow Is this the known issue?

1 Like

@sangcho @jennakwon06 This is a known issue and should be fixed in latest master. @jennakwon06 I’m assuming that this is occurring with Ray 1.2 and not the latest master wheel?

1 Like

Also @sangcho this is the fix that we decided not to pick up for the 1.3 release, so this will continue to be broken until the 1.4 release.

I am using this wheel: pip install -U https://ray-wheels.s3-us-west-2.amazonaws.com/master/190ab40645a241974a87ebeb99e7989886347ba4/ray-2.0.0.dev0-cp37-cp37m-manylinux2014_x86_64.whl

@Clark_Zinzow Gotcha. To confirm - I can expect this to work if I use the latest commit on this 2.0.0 wheel, right?

@jennakwon06 That should be the case! Definitely let us know if it still doesn’t work.