High: It blocks me from completing my task.
I am trying to do distributed LLM model inference with vLLM and Ray on OpenShift. I used the RayService API on OpenShift, following this guide: Serve a Large Language Model with vLLM on Kubernetes — Ray 2.43.0. However, I keep running into the following issue when inspecting events:
pods "llama-3-8b-raycluster-9kkt2-head-" is forbidden: error looking up service account ray-serve/llama-3-8b-raycluster-9kkt2-oauth-proxy-3c567d11: serviceaccount "llama-3-8b-raycluster-9kkt2-oauth-proxy-3c567d11" not found
The service account keeps being created automatically and then deleted immediately, even the ones I create manually. Note that this same deployment worked for me before on vanilla Kubernetes; it just doesn't work on OpenShift. Are there any privilege settings I need to grant this deployment for it to work? I'd appreciate any advice on this. Thank you.
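For reference, this is roughly how I observed the churn, in case it helps with diagnosis (a sketch assuming the `oc` CLI and the `ray-serve` namespace from the error message above; the generated resource names come from my cluster):

```
# Watch service accounts in the namespace -- the generated oauth-proxy
# service account appears and then disappears within seconds
oc get serviceaccounts -n ray-serve -w

# List recent events, newest last, to see the "forbidden" pod-creation errors
oc get events -n ray-serve --sort-by='.lastTimestamp'

# Inspect the RayCluster generated by the RayService for status/conditions
oc describe raycluster llama-3-8b-raycluster-9kkt2 -n ray-serve
```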