Ray k8s cluster, cannot run new task when previous task failed

Sounds good. Thanks for your feedback.

Happy to hear that the problem is solved, and thanks for writing the summary for future users!

I’d still love to understand the SCHEDULING_CANCELLED_RUNTIME_ENV_SETUP_FAILED error, even though you’re correct that runtime_env is not needed in your case. Ideally, runtime_env setup should never fail, or it should at least print a useful error message. I made a mistake earlier when talking about the runtime_env logs: they would be on the node where the failed actor would be placed, not necessarily the head node. Anyway, no obligation to continue with this, as you’re probably eager to move on!
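In case it helps anyone who lands here later, below is a rough sketch of how one might collect the runtime_env logs on a given node. It assumes Ray's default log directory (`/tmp/ray/session_latest/logs`) and that the relevant files have names starting with `runtime_env` (for example `runtime_env_setup-<job_id>.log`); if your cluster uses a custom temp dir, the path will differ. The helper names `find_runtime_env_logs` and `tail` are made up for illustration and are not part of Ray. On Kubernetes you would run this inside the pod of the worker node where the actor was going to be placed, not necessarily the head pod.

```python
import glob
import os

# Assumption: Ray's default per-node log directory. Adjust this if the
# cluster was started with a custom temp dir.
DEFAULT_LOG_DIR = "/tmp/ray/session_latest/logs"


def find_runtime_env_logs(log_dir=DEFAULT_LOG_DIR):
    """Return sorted paths of regular files in log_dir named 'runtime_env*'."""
    pattern = os.path.join(log_dir, "runtime_env*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, errors="replace") as f:
        return f.readlines()[-n:]


if __name__ == "__main__":
    # Print the end of every runtime_env log found on this node.
    for path in find_runtime_env_logs():
        print(f"==> {path}")
        print("".join(tail(path)), end="")
```

If nothing is printed, either the directory does not exist on that node or no runtime_env setup was attempted there, so it is worth checking the other worker pods as well.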
