Killing driver does not kill tasks in Ray on minikube

Hello,

I took a sample program from one of the Ray tutorials and ran the code snippet below from the CLI on the Ray head node:

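import time
import ray

# ray.init(...) is called here

@ray.remote
def f():
    time.sleep(2)

# Submit 1500 tasks and block until all of them have finished
result = ray.get([f.remote() for _ in range(1500)])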

When I execute the above command from the head node CLI, a demand for 1500 CPUs is created. My minikube environment does not have 1500 CPUs, so I assume the tasks are queued and executed in an order that Ray decides.
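To make the queueing concrete, here is a rough back-of-the-envelope sketch. It assumes each task reserves one CPU (Ray's default for tasks) and that the cluster has 4 CPUs; the value 4 is a made-up number for illustration, and scheduling overhead is ignored.

```python
import math

NUM_TASKS = 1500          # tasks submitted by the snippet above
TASK_SECONDS = 2          # each task sleeps for 2 seconds
CLUSTER_CPUS = 4          # hypothetical CPU count of the minikube cluster

# With one CPU per task, at most CLUSTER_CPUS tasks run at the same time,
# so the tasks are worked off in "waves" of that size.
waves = math.ceil(NUM_TASKS / CLUSTER_CPUS)
est_seconds = waves * TASK_SECONDS

print(f"{waves} waves, roughly {est_seconds} seconds in total")
```

With these assumed numbers most of the 1500 tasks are still pending for a long time, which is why it matters what happens to them when the driver goes away.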

When I exit the CLI command with ctrl+Z while the 1500 tasks are still running, I expect the remaining tasks to be terminated, but that is not the case. The tasks keep executing even after I have killed the CLI command.

When I resubmit the 1500 tasks from the CLI, I am not sure in what order they are executed. Will Ray finish the tasks from my killed CLI command before executing the new ones?

Can you please comment on whether this is expected?

Sounds like a bug
@sangcho exiting the driver should cancel pending tasks, right?

Did you call ray.init() or ray.init(address="auto")?

I think I called ray.init(address="auto")

The tasks keep on executing even when I have killed the CLI command.

Does that mean tasks kept executing until all of them are finished?

My expectation is that when the driver exits, pending tasks shouldn’t be executed. cc @Alex Can you follow up if my understanding is correct?

Yes, it kept executing even after the driver was killed.

@asm582 I think the issue here is that you are intending to kill the program, but you’re actually stopping it instead.

In general, ctrl+z sends SIGTSTP (the terminal-generated counterpart of SIGSTOP) to the program, which only pauses it, so the tasks remain because you could always resume the program by running fg or kill -SIGCONT <pid>.

I think what you’re looking for is ctrl+c, which sends SIGINT and actually ends the process, triggering the cleanup (a SIGTERM sent with kill <pid> should have the same effect).
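The difference between stopping and interrupting can be reproduced without Ray. The following is a minimal sketch for Unix-like systems, not Ray's own shutdown logic: a sleeping child process stands in for the driver, SIGSTOP is used as the uncatchable equivalent of the SIGTSTP that ctrl+z sends, and the names CHILD_CODE and stop_then_interrupt are made up for this example.

```python
import signal
import subprocess
import sys
import time

# Stand-in for the driver: installs Python's default SIGINT handler,
# reports that it is ready, then sleeps.
CHILD_CODE = (
    "import signal, time\n"
    "signal.signal(signal.SIGINT, signal.default_int_handler)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def stop_then_interrupt():
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    child.stdout.readline()  # wait until the child has set up its handler

    # Like ctrl+z: the process is suspended, but it still exists.
    child.send_signal(signal.SIGSTOP)
    time.sleep(0.5)
    alive_while_stopped = child.poll() is None

    # Resume it (like fg or kill -SIGCONT), then interrupt it like ctrl+c.
    child.send_signal(signal.SIGCONT)
    child.send_signal(signal.SIGINT)
    returncode = child.wait(timeout=10)
    child.stdout.close()
    return alive_while_stopped, returncode


alive_while_stopped, returncode = stop_then_interrupt()
print("alive while stopped:", alive_while_stopped)
print("exit status after SIGINT:", returncode)
```

The stopped child is still alive, so anything it owns stays around; only after the interrupt does it exit with a nonzero status, which is the point at which a real driver would get the chance to clean up.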


Thank you, I can now confirm that with SIGTERM the tasks are killed.