What is the best practice for keeping actors/workers running forever and having Ray restart them if they stop responding? Is it best practice to use something like "while True: someFunc()" to keep the process up?
What is the best practice for keeping actors/workers running forever and having Ray restart them if they stop responding?
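You can use the max_restarts option described on the Fault Tolerance page of the Ray 3.0.0.dev0 documentation: @ray.remote(max_restarts=-1). A value of -1 means the actor is restarted an unlimited number of times.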
"while True: someFunc()" to keep the process up?
No, this is not a recommended pattern. Ray actors are single-threaded by default, so if a method loops like this, the actor won't be able to receive any more messages. If you want this kind of "background work", you can use a second thread or an async actor:
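import asyncio
import ray

@ray.remote
class A:
    async def process_msg(self):
        # Placeholder for real async work on an incoming message.
        await asyncio.sleep(0)

    async def run(self):
        while True:
            # Yield to the event loop so process_msg can still be scheduled.
            await asyncio.sleep(0)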
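To see why the await inside the loop matters, here is a minimal sketch in plain asyncio with no Ray involved. The Worker class, the max_ticks bound, and demo() are hypothetical names for illustration only; the point is that a loop which awaits on each iteration lets other coroutines run in between, whereas a loop that never awaits would monopolize the single thread.

```python
import asyncio


class Worker:
    """Plain-Python stand-in for an async actor (hypothetical, not Ray code)."""

    def __init__(self):
        self.ticks = 0
        self.handled = []

    async def process_msg(self, msg):
        # Suspend once, as real async work (I/O, etc.) would.
        await asyncio.sleep(0)
        self.handled.append(msg)
        return msg

    async def run(self, max_ticks):
        # Background loop; bounded here only so the demo terminates.
        while self.ticks < max_ticks:
            self.ticks += 1
            # Yield to the event loop so other coroutines get a turn.
            await asyncio.sleep(0)


async def demo():
    worker = Worker()
    background = asyncio.create_task(worker.run(max_ticks=1000))
    # These "messages" are handled while the background loop is mid-flight.
    results = [await worker.process_msg(i) for i in range(3)]
    ticks_seen = worker.ticks
    still_running = not background.done()
    await background
    return results, ticks_seen, still_running, worker.ticks


results, ticks_seen, still_running, final_ticks = asyncio.run(demo())
```

If the await in run() were removed, the loop would finish all of its iterations before any message was handled; with an unbounded while True, messages would never be handled at all.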
Also, by default an actor keeps running until it crashes; while idle, it simply waits to receive messages.
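For the second-thread option mentioned above, the general shape can be sketched in plain Python threading, again without Ray. The names Worker, _run, and stop are hypothetical; inside a Ray actor, the same structure would presumably live in the class decorated with @ray.remote, with the thread started in __init__. Treat this as an illustration of the pattern, not as Ray's prescribed implementation.

```python
import threading
import time


class Worker:
    """Hypothetical worker whose background loop runs in a second thread."""

    def __init__(self):
        self.ticks = 0
        self._stop = threading.Event()
        # Daemon thread so it cannot keep the process alive on its own.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Background work; the main thread stays free to handle messages.
        while not self._stop.is_set():
            self.ticks += 1
            time.sleep(0.001)

    def process_msg(self, msg):
        # Runs on the caller's thread and is not blocked by _run.
        return f"handled {msg}"

    def stop(self):
        self._stop.set()
        self._thread.join()


worker = Worker()
reply = worker.process_msg("ping")
time.sleep(0.05)  # let the background loop make some progress
worker.stop()
ticks = worker.ticks
```

State shared between the two threads (here only the ticks counter, written by a single thread) would need a lock once both threads modify it.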
My crashed actors are not being restarted by Ray.
What is the recommended way to call sys.exit() or quit() so that an actor can force itself to be restarted by Ray?
Can you share a code example showing how you created the actor?