Worker logs are sent to multiple clients

Hello, we are using a Ray cluster to optimize resource usage in a Kubernetes cluster. We have lots of “client” pods scheduling jobs to one shared cluster. The cluster is configured to autoscale as much as possible, and usually every job gets its own worker. The problem is that Ray is mixing logs from all jobs together. All client pods run the same sources, just with different configuration.

For example: client pod X sends jobs A and B to the cluster, and client pod Y sends jobs C and D. All four run at the same time. X receives the logs from jobs A and B, but also some of the logs produced by jobs C and D, which were scheduled by client Y. Y likewise receives logs from all four jobs. The jobs are scheduled on different worker pods.
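One workaround we could apply on our side (a sketch, not a Ray feature — the `make_job_logger` helper and job names are made up for illustration) is to tag every log line with the job name inside the remote function, so each client can filter the mixed stream down to its own jobs:

```python
import io
import logging

def make_job_logger(job_name: str, stream) -> logging.Logger:
    # Hypothetical helper: one logger per job, with the job name baked
    # into every formatted line so the receiving client can grep for it.
    logger = logging.getLogger(f"job.{job_name}")
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(f"[{job_name}] %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # don't duplicate lines via the root logger
    return logger

buf = io.StringIO()
log = make_job_logger("client-X-job-A", buf)
log.info("starting optimization")
print(buf.getvalue().strip())  # → [client-X-job-A] starting optimization
```

This doesn't stop the cross-client streaming, but it makes the mixed output attributable.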

Is there a configuration option we have missed that would prevent this behavior? We are running Ray v1.5.2.

Hey @kubav, how are these client pods connecting in? Are they using Ray Client (i.e. the gRPC `ray://` connection)?

The client calls `ray.init`, and each job is just one function annotated with `@ray.remote`.

@kubav are you passing an address in ray.init()?

Yes. We have already deployed about 20 client pods, and each one sends hundreds of jobs a day. We use the autoscaler because jobs do not arrive steadily throughout the day (the worker pod count varies between 1 and 150). Everything works quite well, except that logs from different jobs get mixed together.

Init looks like:

ray.init(
  address=f'ray://{address}:{port}',
  runtime_env={'working_dir': '/path/to/sources'},
  namespace=namespace
)
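One knob worth checking (an assumption on my part — I haven't verified how Ray Client handles it in 1.5.2): `ray.init` accepts a `log_to_driver` flag that controls whether worker stdout/stderr is streamed back to the driver at all. Disabling it and reading job logs directly from the worker pods (e.g. via `kubectl logs`) would at least stop the cross-client mixing:

```python
# Sketch only — behavior with a ray:// (Ray Client) address on 1.5.2
# should be verified before relying on this.
ray.init(
    address=f'ray://{address}:{port}',
    runtime_env={'working_dir': '/path/to/sources'},
    namespace=namespace,
    log_to_driver=False,  # don't stream worker logs back to this client
)
```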

Sending job looks like this:

remote_func.options(
    name=job_name,
    resources={'worker1': 1},  # choose worker type
).remote(params)