For example, suppose I want to replace one actor with another (both use a GPU), and I want to make sure the first actor is fully dead and no longer holding GPU memory before the second one starts.
Note that you can do
ray.kill returns before the actor process finishes exiting. You can check with the following script.
    import os
    import ray

    ray.init()

    @ray.remote
    class Actor:
        def get_pid(self):
            return os.getpid()

    a = Actor.remote()
    pid = ray.get(a.get_pid.remote())
    ray.kill(a)
    os.kill(pid, 0)  # Raises exception if process is dead, otherwise nothing
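If you need to block until the process is really gone, the same `os.kill(pid, 0)` probe can be wrapped in a polling loop. This is only a rough sketch: `pid_alive` and `wait_for_exit` are hypothetical helper names (not part of Ray), it assumes a POSIX system, and it only works when the caller runs on the same machine as the actor process.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID still exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # Signal 0 performs the existence check without sending a signal.
    except ProcessLookupError:
        return False  # No such process: it has exited.
    except PermissionError:
        return True  # Process exists but belongs to another user.
    return True


def wait_for_exit(pid, timeout=10.0, interval=0.05):
    """Poll until the process exits; return False if it is still alive at the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return True
        time.sleep(interval)
    return not pid_alive(pid)
```

With the script above, you would call `wait_for_exit(pid)` right after `ray.kill(a)` and only create the replacement actor once it returns `True`.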
cc @yic is this something you are aware of?
Actor killing is currently asynchronous, and there is no option to make it synchronous. One workaround is to give the actor a remote method that releases all the GPU resources it holds, call that method first, and then kill the actor and start a new one. That way, you can be sure no actor is still using the resource when the new one is being created.
If you think synchronous exit is an important feature, you can open an issue requesting it.