How severely does this issue affect your experience of using Ray?
Medium: It contributes to significant difficulty to complete my task, but I can work around it.
We currently log all of our exceptions to Sentry. This is set up through a Python site-wide exception handler. Exceptions that occur inside Ray agents bypass this handler; Ray just logs them internally as unhandled exceptions. There is a flag, RAY_IGNORE_UNHANDLED_ERRORS, but it only ignores the exceptions. It doesn’t let them bubble up to the site-wide exception handler. I’ve tried injecting a handler into ray._private.worker._unhandled_error_handler, but it seems to be ignored.
What is the best way to use my own custom handler for unhandled exceptions?
Yes, that is the short-term solution I am using. The reason we use site-wide exception handling, though, is so that an exception is still detected and logged even if a mistake is made and it is not handled, for example when a Ray actor method is not wrapped with the exception handler.
In Python, replacing sys.excepthook is the standard way to do this.
I understand if Ray doesn’t want to let exceptions crash the main process, but it would be nice to be able to provide an exception handler of last resort, to maximize the chances that an exception is detected.
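For reference, the sys.excepthook pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not our actual setup: `report_exception` and the `reported` list are hypothetical stand-ins for the call into Sentry.

```python
import sys
import traceback

# Hypothetical sink standing in for an external service such as Sentry.
reported = []


def report_exception(exc_type, exc_value, exc_tb):
    """Hypothetical reporter: stores the formatted traceback."""
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    reported.append(text)


def install_excepthook():
    """Install a site-wide hook and return the hook it replaced."""
    previous_hook = sys.excepthook

    def hook(exc_type, exc_value, exc_tb):
        # Report first, then defer to the previous hook so the
        # traceback is still printed as usual.
        report_exception(exc_type, exc_value, exc_tb)
        previous_hook(exc_type, exc_value, exc_tb)

    sys.excepthook = hook
    return previous_hook
```

Python only calls this hook for exceptions that nothing else catches in the main thread, which is why it never fires for exceptions that Ray catches inside its workers.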
I see. I don’t think there is a way to disable Ray’s exception wrapping.
The closest thing I can think of is to write your own wrapper classes or function decorators for your Ray actor and task definitions, so that each exception is passed to your handler before it propagates to Ray. But yes, it is up to you to make sure that the wrappers are always used.
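A rough sketch of such a wrapper, under some assumptions: `report_exceptions` and `report_exception` are hypothetical names, the reporter just appends to a list in place of a real Sentry call, and async methods are deliberately left unwrapped to keep the example short.

```python
import functools
import inspect

# Hypothetical sink standing in for the real handler (e.g. Sentry).
reported = []


def report_exception(exc):
    """Hypothetical reporter: records the exception object."""
    reported.append(exc)


def report_exceptions(target):
    """Decorate a function, or every plain method defined on a class,
    so exceptions are reported and then re-raised unchanged."""
    if inspect.isclass(target):
        for name, member in list(vars(target).items()):
            # Skip async methods in this sketch; wrapping them with a
            # sync wrapper would change how they are detected.
            if inspect.isfunction(member) and not inspect.iscoroutinefunction(member):
                setattr(target, name, report_exceptions(member))
        return target

    @functools.wraps(target)
    def wrapper(*args, **kwargs):
        try:
            return target(*args, **kwargs)
        except Exception as exc:
            report_exception(exc)
            raise  # let the exception continue on to the caller (or Ray)

    return wrapper
```

With Ray, the idea would be to put @ray.remote above @report_exceptions on each task or actor, so that Ray wraps the already instrumented definition. The reporting then runs inside the worker process, so whatever the reporter depends on would need to be initialized there as well.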