I am running Ray 2.3.1 on my Mac Pro. I also have Grafana and Prometheus running on this machine. I have verified that both are working by checking localhost:3000 and localhost:9090, respectively. I launch a local Ray cluster like so:
export RAY_GRAFANA_HOST=http://127.0.0.1:3000
export RAY_PROMETHEUS_HOST=http://127.0.0.1:9090
ray start --head
Ray starts, but the Ray monitor at 127.0.0.1:8265 shows broken cluster monitoring windows.
If I hover over one of the windows I see the message “127.0.0.1 refused to connect”.
The Ray cluster itself works correctly, as does the Recent jobs tab of the monitor.
I have tried adding export RAY_GRAFANA_IFRAME_HOST=http://127.0.0.1:3000, as well as not setting any of these environment variables, and see the same result.
I watched the web traffic with Chrome developer tools while refreshing the Ray monitor web page. The following things looked wrong:
Two calls to roboto-latin.500 on the Ray monitor port failed with the message “Failed to load response data. No data found for resource for given identifier” in the Response tab.
Two calls to default-dashboard?... on the Grafana port showed the message “Failed to load response data: No content available because this request was redirected” in the Response tab.
Two calls to login on the Grafana port showed the message “Failed to load response data. No resource with the given identifier found” in the Response tab.
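The redirect to login and the "refused to connect" message would be consistent with Grafana declining to be shown inside an iframe, though that is only a guess from the symptoms above. A minimal sketch for checking this from the command line follows. The function name check_embed_headers is hypothetical, and the two conditions it looks for (an X-Frame-Options header, or a redirect to a login page) are assumptions about what blocks embedding here, not a complete diagnosis.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and reports
# whether they look like they would stop a browser from embedding the page.
check_embed_headers() {
  # Strip carriage returns so header lines can be matched cleanly.
  headers=$(tr -d '\r')
  if printf '%s\n' "$headers" | grep -qi '^x-frame-options:'; then
    echo "blocked: X-Frame-Options is set"
  elif printf '%s\n' "$headers" | grep -qi '^location:.*login'; then
    echo "blocked: redirected to login"
  else
    echo "ok: no embedding blockers found"
  fi
}

# Usage against the Grafana port from the question (needs Grafana running):
#   curl -sI http://127.0.0.1:3000/ | check_embed_headers
```

If this reports a blocker, the Grafana configuration is the place to look rather than the Ray environment variables.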
How do I get Grafana and Prometheus to integrate with Ray?
To the best of my knowledge I followed the documentation instructions you linked to correctly.
I’ll try running ray dashboard <cluster config file>, but I don’t know where my cluster config file is. I’m just having Ray create a local cluster by default.
That worked. The Cluster Utilization and Node Count windows now display data.
For anyone else who hits this, here is exactly how I made it work on my Mac.
brew install grafana
brew install prometheus
Change the --config.file line in /usr/local/etc/prometheus.args to read --config.file /tmp/ray/session_latest/metrics/prometheus/prometheus.yml.
Uncomment the appropriate lines in /usr/local/etc/grafana/grafana.ini so that it matches the contents of /tmp/ray/session_latest/metrics/grafana/grafana.ini.
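To see which lines need uncommenting, it can help to list only the active settings in each file and compare them by eye. The sketch below assumes both files use ; or # for comments. The function name active_ini_lines is hypothetical, and the paths are the ones from the steps above.

```shell
# Hypothetical helper: print only the active lines of an ini file,
# skipping blank lines and lines commented out with ';' or '#'.
active_ini_lines() {
  grep -Ev '^[[:space:]]*([;#]|$)' "$1"
}

# Usage (paths taken from the steps above):
#   active_ini_lines /tmp/ray/session_latest/metrics/grafana/grafana.ini
#   active_ini_lines /usr/local/etc/grafana/grafana.ini
```

Any section or key that appears in the first listing but not the second is a candidate for uncommenting in the Homebrew copy. Restart Grafana and Prometheus afterwards so they pick up the changes.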