Is it a good idea to start a large number of tasks all at once?

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.


I am new to ray and I have a problem where I need to launch a large number of independent tasks (e.g., 100,000-ish) on a limited number of GPUs (~8). I am wondering if it is a good practice to launch those tasks all at once via foo.remote() and have ray to handle the scheduling work. If it is not, what should be the proper implementation?


According to ray/release/benchmarks at master · ray-project/ray · GitHub, it is not recommended to submit more than 1Million tasks at a time (# of tasks queued on a single node). If you submit 100,000 tasks, it will be just queued, so it must be okay. You can also control the # of concurrent tasks using num_cpus or num_gpus flag for each task.

1 Like