How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty in completing my task, but I can work around it.
Is there a way to kill the remaining tasks early once an exception occurs during Ray DAG execution? In the following snippet, the i-th task (created from function f) only sleeps for i seconds, except that the task with i == 3 also raises an exception after sleeping. Ideally, that exception would be raised after roughly 3 seconds and would kill all the remaining tasks. In practice, the exception is only raised after ~10 s, which could waste a lot of computational resources in real-world workloads. Does anyone have an idea how I could achieve this? Thanks in advance.
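import ray
from time import sleep


@ray.remote
def f(i):
    # Task i sleeps for i seconds; only the task with i == 3 fails afterwards.
    sleep(i)
    if i == 3:
        raise ValueError


@ray.remote
def g(*args):
    # Placeholder that depends on the output of every f task.
    pass


xs = [f.bind(i) for i in range(10)]
y = g.bind(*xs)
ray.get(y.execute())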