How to limit the number of concurrent tasks in Dask on Ray?

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I’m super happy with the stability of Dask on Ray on KubeRay, but at the moment I think the only way to control the amount of parallelism is by tweaking the number of Dask array chunks (i.e., one chunk == one task?).

Is there another way? Because with a (delayed) 1 PB Dask array, the chunk size has to get quite large (exceeding a single worker’s main memory) to keep the number of tasks running in parallel low enough.
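
To make the trade-off concrete, here is a minimal sketch (the shapes and chunk sizes are made up, not my real data):

```python
import dask.array as da

# Made-up array: ~8 TB of float64, split into 10,000 x 10,000 chunks (~800 MB each).
# Dask creates roughly one task per chunk per operation, so today the chunk size is
# the only knob I have over the task count.
x = da.zeros((1_000_000, 1_000_000), chunks=(10_000, 10_000), dtype="float64")
print(x.numblocks)    # (100, 100)
print(x.npartitions)  # 10_000 chunks -> roughly 10_000 tasks for e.g. x.sum()

# Fewer tasks means bigger chunks, and per-chunk memory quickly exceeds a single
# worker's RAM: 50,000 x 50,000 float64 is already ~20 GB per chunk.
y = x.rechunk((50_000, 50_000))
print(y.npartitions)  # 400 chunks
```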

Moved to Stack Overflow: “How to limit # of parallel tasks on kuberay when computing a dask array?”

Some documents I found:

Maybe you can also specify the worker group’s maxReplicas (example). The autoscaler will not create more worker Pods than maxReplicas, which may prevent the additional workers from causing the Kubernetes cluster OOM you mentioned in the Stack Overflow post.
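
For reference, a rough sketch of the relevant part of a RayCluster manifest (field names as in the KubeRay docs; the values are made up):

```yaml
# Excerpt of a RayCluster custom resource. maxReplicas caps how many worker Pods
# the Ray autoscaler may create for this worker group.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: dask-on-ray-cluster
spec:
  enableInTreeAutoscaling: true
  workerGroupSpecs:
    - groupName: workers
      replicas: 2
      minReplicas: 2
      maxReplicas: 8        # the autoscaler will never scale past 8 worker Pods
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.9.0
              resources:
                limits:
                  cpu: "4"
                  memory: 16Gi
```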

The problem is that Dask on Ray submits all 233,280 tasks to Ray at once; Ray then tries to start additional workers, but eventually has to kill them again because of out-of-memory (OOM) errors.
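
For context, this is roughly what triggers it on my side (simplified; the shape, chunking, and reduction are stand-ins for the real workload):

```python
import dask.array as da
import ray
from ray.util.dask import enable_dask_on_ray

ray.init(address="auto")  # connect to the running KubeRay cluster
enable_dask_on_ray()      # route Dask's .compute() calls through Ray tasks

# Stand-in for the real (much larger) array: one Ray task per chunk per operation.
x = da.random.random((1_000_000, 1_000_000), chunks=(2_000, 2_000))
print(x.npartitions)      # 250_000 chunks

# This single call hands the whole task graph to Ray at once; the autoscaler keeps
# adding worker Pods to chase the backlog until they are OOM-killed.
total = x.sum().compute()
```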