I deployed a Ray cluster on Kubernetes using KubeRay, and I want to monitor it with Prometheus metrics. From the Ray documentation, I know that a service discovery file is generated on the head node at /tmp/ray/prom_metrics_service_discovery.json. With the Prometheus config below, Prometheus automatically updates the addresses it scrapes based on the contents of Ray’s service discovery file.
# Prometheus config file
# my global config
# Scrape from Ray.
- job_name: 'ray'
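The excerpt above seems to be cut off after the job name. For reference, here is a minimal sketch of what a complete file-based discovery config could look like, assuming the discovery file path mentioned above. The `scrape_interval` and `evaluation_interval` values are illustrative assumptions, not values taken from this post.

```yaml
# Sketch of a Prometheus config using file-based service discovery.
global:
  scrape_interval: 15s      # assumed value; tune for your setup
  evaluation_interval: 15s  # assumed value

scrape_configs:
  # Scrape from Ray.
  - job_name: 'ray'
    file_sd_configs:
      - files:
          # Ray's service discovery file on the head node.
          - '/tmp/ray/prom_metrics_service_discovery.json'
```

Note that `file_sd_configs` only works if the Prometheus process can read that file, so Prometheus would need access to the head node's filesystem (for example, through a shared volume).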
But since I am using Kubernetes, in my limited experience the most convenient way to have Prometheus scrape the Ray metrics would be to expose the scrape configuration through Service annotations, like this: