[Dask on Ray] Low cluster utilization

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hi,

I have a problem running Dask on Ray.
My configuration: 4 nodes with 4 CPUs each = 16 CPUs in total.
My use case:

A simple broadcast join: the left side is a large Parquet-backed Dask DataFrame, the right side is a small pandas DataFrame.

import dask.dataframe as dd
import pandas as pd
from ray.util.dask import RayDaskCallback, ray_dask_get

a = dd.read_parquet(...)  # Dask DataFrame; a.npartitions == 200

b = pd.read_csv(...)  # pandas DataFrame

def my_presubmit_cb(task, key, deps):
    print(f"About to submit task {key}!")

with RayDaskCallback(ray_presubmit=my_presubmit_cb):
    rr = a.merge(b, on=[....], how="inner", broadcast=True
    ).compute(scheduler=ray_dask_get)

I can see that cluster utilization is very low:
only about 4 tasks run in parallel, even though I have 16 CPUs and the potential parallelism is 200.

I was looking at the Dask-on-Ray scheduler code and saw that tasks are submitted through a pool.

Am I correct that the number of active tasks is limited by the pool size (in my case, num_cpus on the driver = 4)?
Can I increase the parallelism?
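To illustrate what I think is happening (my own stdlib sketch, not Ray code): if tasks are dispatched through a fixed-size pool of submitter threads, concurrency is capped at the pool size no matter how many CPUs the cluster has.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

active = 0  # tasks currently "running"
peak = 0    # highest concurrency observed
lock = threading.Lock()

def task(_):
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # simulate task work
    with lock:
        active -= 1

# 200 "partitions", but only 4 submitter threads in the pool:
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(task, range(200)))

print(peak)  # peak concurrency never exceeds the pool size of 4
```

If that is indeed the bottleneck, is there a supported way to make the pool size follow the cluster's CPU count instead of the driver's?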

Thank you.