On Ray issue "Ray starts too many workers (and may crash) when using nested remote functions."

Recently, I ran into a Ray crash and found Ray issue #3644 on GitHub: “Ray starts too many workers (and may crash) when using nested remote functions.”

Is there any plan to fix it in a future release?
Thank you!

When you run a nested task and call ray.get on it, Ray starts new workers. Imagine it didn't: the parent task would keep its worker while blocked in ray.get, so the nested task would never get a worker to run on, and your application would deadlock. I believe you can mitigate the issue if you have enough resources in the cluster.
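To make the deadlock argument concrete, here is a rough analogy using Python's standard-library thread pool rather than Ray itself. It is only a sketch: ThreadPoolExecutor is not Ray's scheduler, and the names inner, outer, and run are made up for illustration. A fixed-size pool that never adds workers stalls as soon as a running task blocks on a nested task submitted to the same pool.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def inner():
    # Stand-in for the nested remote task.
    return 1


def outer(pool, timeout):
    # Stand-in for the parent task: it submits a nested task to the same
    # pool and blocks on the result, loosely like ray.get on a nested task.
    child = pool.submit(inner)
    return child.result(timeout=timeout) + 1


def run(pool_size, timeout=1.0):
    # Hypothetical helper: returns the parent's result, or "stalled" if the
    # nested task could not run before the timeout expired.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        parent = pool.submit(outer, pool, timeout)
        try:
            return parent.result()
        except FutureTimeout:
            return "stalled"


if __name__ == "__main__":
    print(run(pool_size=1))  # the only worker is held by the blocked parent
    print(run(pool_size=2))  # a second worker is free to run the nested task
```

Without the timeout, the single-worker case would block forever. As I understand it, that is the situation Ray avoids by starting an extra worker while the parent is blocked, at the cost of more worker processes.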

@sangcho, thanks for your generous reply! Yes, I got the workaround from the discussion on #3644. I am wondering whether a fix is planned for a future Ray release.

I am not sure, but my opinion is that this is a feature, not a bug. Without it, you would hit a different problem, a deadlock, which might be harder to debug.

@darkwhale, can you share the workload that caused the crash (ideally a minimal runnable script)?