Ray Client doesn't work on k8s with Ingress

Hello Ray team! I’d like to ask something about Ray Client.
I want to connect a Ray Client to a Ray cluster through a k8s Ingress.

But it doesn’t work. The following error message occurred:

ERROR:ray.util.client.server.proxier:Client connecting with no client_id

I tested with the following Ingress configuration:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: xxxx
  namespace: xxxx
  annotations:
    kubernetes.io/ingress.class: nginx-10g
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: [target ray cluster service name]
              servicePort: 10001
  tls:
    - secretName: [tls secret name]
      hosts:
        - example.com

I tested with the following Python code:

from ray.util.client.worker import Worker

worker = Worker("example.com:443", secure=True, metadata=[("client_id", "test_client")])

When I tested on my local machine or through kubectl port-forward, it worked well.
But it didn’t work when I tested through the Ingress.

Test environment

  • ray==1.4.0
  • python==3.6
  • grpcio==1.32.0

cc @Dmitri can you address the question? K8s + ray client.

Hi @77loopin !

To make sure the problem persists with a more recent Ray version, could you
(1) Try it with Ray 1.5.1?

The issue is most likely a Ray bug and not an issue with your networking setup, but
(2) have you tried using an identical Ingress setup to expose other (perhaps simpler) gRPC services?

cc @ijrsvt for the ray client part of this

Hi @Dmitri :slight_smile:

(1) Yes, I tried. But it also didn’t work :frowning:

When I created a worker as described in my post, the following debug messages were printed:

DEBUG:ray.util.client.worker:client gRPC channel state change: ChannelConnectivity.IDLE
DEBUG:ray.util.client.worker:client gRPC channel state change: ChannelConnectivity.CONNECTING
DEBUG:ray.util.client.worker:client gRPC channel state change: ChannelConnectivity.READY
DEBUG:ray.util.client.worker:Pinging server.

The PING test using ray.rpc.RayletDriver/ClusterInfo works, but other gRPC calls don’t work because there is no client id.

(2) Yes, other gRPC services work well.

Hi @Dmitri

I resolved this issue.

Depending on its configuration, the Ingress controller drops headers that it considers invalid from gRPC requests. With default settings, nginx treats header names containing underscores as invalid, so it removed the ‘client_id’ header from the requests sent by Ray Client.
I found the solution to this issue here.

I resolved it by adding the following Ingress annotation:

metadata:
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      underscores_in_headers on;
      ignore_invalid_headers on;
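
To illustrate why the annotation helps, here is a rough Python sketch of the header-name rule as the nginx docs describe it (letters, digits and hyphens are valid; underscores only when underscores_in_headers is on). It is a simplified model, not nginx's actual parser, and the helper names is_valid_header_name and forwarded_metadata are made up for this example.

```python
import re

# Simplified model of the rule described in the nginx docs. Assumption:
# only the character set matters here; the real parser does more than this.
_BASE = re.compile(r"[A-Za-z0-9-]+")
_WITH_UNDERSCORES = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_header_name(name, underscores_in_headers=False):
    """Return True if the header name would count as valid under this model."""
    pattern = _WITH_UNDERSCORES if underscores_in_headers else _BASE
    return pattern.fullmatch(name) is not None


def forwarded_metadata(metadata, underscores_in_headers=False):
    """Keep only the metadata pairs whose key survives when invalid headers are ignored."""
    return [
        (key, value)
        for key, value in metadata
        if is_valid_header_name(key, underscores_in_headers)
    ]


metadata = [("client_id", "test_client")]

# Default settings: the underscore makes the name invalid, so it is dropped.
print(forwarded_metadata(metadata))  # → []

# With `underscores_in_headers on;` the header reaches the Ray Client server.
print(forwarded_metadata(metadata, underscores_in_headers=True))  # → [('client_id', 'test_client')]
```

This matches what I saw: the gRPC channel became READY, but the server never received ‘client_id’.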

I think it would be good to add a note about this issue to the Ray Client guide.
I sent a PR that adds it to the Ray Client guide. ([RayClient] Add the guide for k8s Ingress by 77loopin · Pull Request #17736 · ray-project/ray · GitHub)