I’m currently working on a project using Ray Serve and I’m interested in setting a limit on the number of retries for deployments that fail (not replicas), I’m unable to find a direct method to restrict how many times a deployment should retry upon failure.
Could anyone provide insights or best practices on how to implement retry limits for deployment failures in Ray Serve? Are there any patterns or custom logic that you recommend integrating into the deployment code to achieve this?
Any guidance or examples would be greatly appreciated!