Ray Monitor Not Connecting to Grafana and Prometheus

I am running Ray 2.3.1 on my Mac Pro. I also have Grafana and Prometheus running on this machine. I have verified that both are working by checking localhost:3000 and localhost:9090, respectively. I launch a local Ray cluster like so:

ray start --head

Ray starts, but the Ray monitor shows broken cluster monitoring windows. The screen looks like this: [screenshot]

If I hover over one of the windows I see the message “ refused to connect”.

The Ray cluster itself works correctly, as does the Recent jobs tab of the monitor.

I have tried adding export RAY_GRAFANA_IFRAME_HOST=, as well as not setting any of these environment variables, and see the same result.

I watched the web traffic with Chrome developer tools while refreshing the Ray monitor web page. The following things looked wrong:

  • Two calls to roboto-latin.500 on the Ray monitor port failed with the message “Failed to load response data. No data found for resource for given identifier” in the Response tab.
  • Two calls to default-dashboard?... on the Grafana port showed the message “Failed to load response data: No content available because this request was redirected” in the Response tab.
  • Two calls to login on the Grafana port showed the message “Failed to load response data. No resource with the given identifier found” in the Response tab.
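To tell apart a service that is down from one that is up but refusing to be embedded, it can help to probe each service directly, outside the browser. Below is a minimal sketch, not an official Ray tool. It assumes the default addresses (localhost:3000 for Grafana, localhost:9090 for Prometheus, localhost:8265 for the Ray dashboard) and uses Grafana's /api/health and Prometheus's /-/healthy endpoints. The names ENDPOINTS, probe and probe_all are made up for illustration.

```python
import urllib.error
import urllib.request

# Assumed default addresses; adjust if your services listen elsewhere.
ENDPOINTS = {
    "grafana": "http://localhost:3000/api/health",
    "prometheus": "http://localhost:9090/-/healthy",
    "ray dashboard": "http://localhost:8265/",
}


def probe(url, timeout=3.0):
    """Return a short status string for one URL without raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status code.
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing listening, DNS failure, timeout, and similar.
        return f"unreachable ({exc})"


def probe_all(endpoints=ENDPOINTS):
    """Probe every endpoint and return {name: status string}."""
    return {name: probe(url) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, status in probe_all().items():
        print(f"{name:14s} {status}")
```

If all three report an HTTP status but the panels still say "refused to connect", the services are running and the problem is more likely in how Grafana is configured (for example, whether it allows being embedded in an iframe) than in connectivity.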

How do I get Grafana and Prometheus to integrate with Ray?

@rickyyx @sangcho Do we have specific instructions for installing Grafana and Prometheus on localhost, and for how the Ray dashboard can discover their configs?

Does @wpm have to use the dashboard command, ray dashboard [-p <port, 8265 by default>] <cluster config file>?

To the best of my knowledge I followed the documentation instructions you linked to correctly.

I’ll try running ray dashboard <cluster config file>, but I don’t know where my cluster config file is. I’m just having Ray create a local cluster by default.

ray dashboard is not needed for a local Ray cluster. Hmm, I just tried setting those up on my MacBook Pro, and it worked fine.

  • Grafana 9.4.7
  • Prometheus 2.43.0
  • Grafana started with brew services start grafana
  • Prometheus started with docker run -p 9090:9090 prom/prometheus
  • I don’t think I have any dashboards in Grafana. I poked around the UI and didn’t see anything.

I have a dashboards page that looks like this: [screenshot]

According to: Metrics — Ray 2.3.1

You need to start Prometheus and Grafana with the config files provided by Ray so that:

  • Prometheus can scrape the metrics from the Ray cluster properly
  • Grafana can talk to Prometheus and visualize the metrics with the template dashboard provided by Ray
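One way to confirm the first point, that Prometheus is actually scraping the Ray cluster, is to ask Prometheus for its active targets through its HTTP API (/api/v1/targets). This is a rough sketch, assuming Prometheus listens on localhost:9090. The names summarize_targets and fetch_targets are invented here and are not part of Ray or Prometheus.

```python
import json
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumed default address


def summarize_targets(payload):
    """Map each scrape job to a list of (scrape URL, health) pairs."""
    summary = {}
    for target in payload.get("data", {}).get("activeTargets", []):
        job = target.get("labels", {}).get("job", "<unknown>")
        summary.setdefault(job, []).append(
            (target.get("scrapeUrl", ""), target.get("health", "unknown"))
        )
    return summary


def fetch_targets(base_url=PROMETHEUS_URL, timeout=3.0):
    """Fetch the targets document from a running Prometheus server."""
    url = f"{base_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example usage (needs Prometheus to be running):
#   for job, targets in summarize_targets(fetch_targets()).items():
#       for scrape_url, health in targets:
#           print(job, scrape_url, health)
```

If Prometheus was started with its stock config rather than the one Ray generates, I would expect no Ray-related job to appear in this summary, which would match the empty panels described above.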

Can you give it a try?

That worked. The Cluster Utilization and Node Count windows now display data.

For the reference of anybody else who hits this, here is exactly how I made this work on my Mac.

  1. brew install grafana

  2. brew install prometheus

  3. Change the --config.file line in /usr/local/etc/prometheus.args to read --config.file /tmp/ray/session_latest/metrics/prometheus/prometheus.yml.

  4. Uncomment the appropriate lines in /usr/local/etc/grafana/grafana.ini so that it matches the contents of /tmp/ray/session_latest/metrics/grafana/grafana.ini.

  5. brew services start grafana

  6. brew services start prometheus

  7. ray start --head
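Step 4 is the easiest one to get subtly wrong, so here is a small sketch that lists which settings from Ray's generated grafana.ini are missing or different in the local one. It is only an illustration under a few assumptions: both files parse as plain INI with = as the delimiter, commented-out lines count as unset, and the paths are the ones from the steps above (an Apple Silicon Homebrew install would typically live under /opt/homebrew instead of /usr/local). The names load_ini and missing_settings are hypothetical.

```python
import configparser
from pathlib import Path

# Paths taken from the steps above; adjust for your Homebrew prefix.
RAY_INI = Path("/tmp/ray/session_latest/metrics/grafana/grafana.ini")
LOCAL_INI = Path("/usr/local/etc/grafana/grafana.ini")


def load_ini(text):
    """Parse grafana.ini-style text; lines starting with ; or # are ignored."""
    parser = configparser.ConfigParser(
        interpolation=None,  # treat % in values literally
        strict=False,        # tolerate repeated sections or keys
        delimiters=("=",),   # do not split keys or values on ":"
    )
    # A placeholder section lets keys that precede any [section] parse.
    parser.read_string("[__top__]\n" + text)
    return parser


def missing_settings(wanted_text, actual_text):
    """Return (section, key, wanted, actual) for settings that differ."""
    wanted, actual = load_ini(wanted_text), load_ini(actual_text)
    diffs = []
    for section in wanted.sections():
        for key, value in wanted.items(section):
            current = actual.get(section, key, fallback=None)
            if current != value:
                diffs.append((section, key, value, current))
    return diffs


# Example usage (needs both files to exist):
#   for row in missing_settings(RAY_INI.read_text(), LOCAL_INI.read_text()):
#       print(row)
```

An empty result means every setting Ray asks for is already active in the local file. A row whose last element is None means the setting is absent or still commented out.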

Thanks for your help.

Glad that it worked out!

@aguo We should probably add some guides for Homebrew-based workflows ^. Added it to our backlog.