How severe does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Hi,
I have a problem when canceling tasks.
Description
I have some tasks that I want to run and then cancel if they run too long (the cancellation is triggered dynamically by another task). The tasks themselves call subprocess.Popen() to execute an external script. When I cancel the tasks with ray.cancel, they are only canceled after the subprocess is done, instead of the subprocess being canceled as well. That is, the KeyboardInterrupt is only raised once the task that should be killed has already finished. I thought about working with signals and killing the subprocess when one arrives, but I cannot catch the ray.cancel signal within the task. Below is an example. I replaced the subprocess with a sleep, since I see the same behavior in that case.
Code
import time

import ray


@ray.remote(num_cpus=1)
def long_process(sleep):
    # Stand-in for the real task, which runs an external script.
    try:
        print(f"Starting {time.ctime()}, {sleep}")
        time.sleep(sleep)
        print(f"Finished {time.ctime()}, {sleep}")
    except KeyboardInterrupt:
        print(f"Interrupted {time.ctime()}, {sleep}")


@ray.remote(num_cpus=1)
def monitor(sleep, tasks_kill):
    # Wait, then cancel every task reference that was passed in.
    time.sleep(sleep)
    for t in tasks_kill:
        ray.cancel(t)
    print("Killing processes at:", time.ctime())
    return sleep


rtime = [10, 15]
long_tasks = [long_process.remote(rt) for rt in rtime]
# Separate name so the remote function `monitor` is not shadowed.
monitor_task = monitor.remote(5, long_tasks)
run = long_tasks + [monitor_task]
ray.get(run)
Output:
(long_process pid=85820) Starting Sun Mar 20 12:28:17 2022, 10
(long_process pid=85821) Starting Sun Mar 20 12:28:17 2022, 15
(monitor pid=85819) Killing processes at: Sun Mar 20 12:28:22 2022
(long_process pid=85820) Interrupted Sun Mar 20 12:28:27 2022, 10
(long_process pid=85821) Interrupted Sun Mar 20 12:28:32 2022, 15
Problem:
The sleep should be interrupted (canceled) and the process should end immediately. As you can see, the KeyboardInterrupt is only handled after the sleep finishes. Is there some way to catch the KeyboardInterrupt, gracefully shut down the subprocess (the sleep here), and return from long_process?
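For reference, here is a minimal sketch of the kind of workaround I have in mind, without Ray so it can be run on its own. It assumes that the interrupt only surfaces once the blocking call returns, which is what the output above suggests, so it waits on the subprocess in short slices instead of one long blocking call. The helper name run_cancellable and its poll_interval parameter are my own, not part of Ray or the subprocess module.

```python
import subprocess
import sys


def run_cancellable(cmd, poll_interval=0.1):
    """Run cmd as a child process and return its exit code.

    Hypothetical helper: waits in short slices so that a pending
    KeyboardInterrupt can surface between slices. On interrupt the
    child is terminated and the exception is re-raised.
    """
    proc = subprocess.Popen(cmd)
    try:
        while True:
            try:
                return proc.wait(timeout=poll_interval)
            except subprocess.TimeoutExpired:
                continue  # child still running; wait another slice
    except KeyboardInterrupt:
        proc.terminate()  # ask the child to stop (SIGTERM on POSIX)
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()  # escalate if the child ignores the request
            proc.wait()
        raise


if __name__ == "__main__":
    # Stand-in for the external script: a short Python one-liner.
    code = run_cancellable([sys.executable, "-c", "print('child done')"])
    print("exit code:", code)
```

Inside long_process I would call this helper instead of blocking on the subprocess directly. I have not confirmed that this makes ray.cancel take effect promptly, so I would still like to know whether there is a recommended way.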
Many thanks