Embedding Grafana visualizations into Ray Dashboard

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi!

I’ve set up a Ray head node in Docker but am running into some issues with embedding Grafana visualizations into the dashboard. I have separate servers for both Prometheus and Grafana. As far as I can tell these are working as expected, and I can see metrics about the head node in Grafana. The dashboard also seems to be working, except the metrics section says “Set up Prometheus and Grafana for better Ray Dashboard experience”

Some things I’ve done:

  • Set RAY_GRAFANA_HOST with protocol/no trailing slash
  • Set RAY_PROMETHEUS_HOST with protocol/no trailing slash
  • Set RAY_GRAFANA_IFRAME_HOST (Might have done this one incorrectly, I set it to my-grafana-host/d/rayDefaultDashboard/dashboard-name)
  • In Grafana, under [security] I’ve set allow_embedding/cookie_secure to true, and cookie_samesite to none
  • Enabled anonymous access (is there a way to pass credentials from Ray to Grafana?)
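For anyone checking the same Grafana settings, here is a minimal sketch that reads a grafana.ini and reports which of the options listed above differ from what was described. The function name `check_grafana_ini` and the `__top__` placeholder section are made up for illustration; the section and key names are the Grafana ones mentioned in the list.

```python
import configparser

# Settings described in the list above. Keys are (section, option).
EXPECTED = {
    ("security", "allow_embedding"): "true",
    ("security", "cookie_secure"): "true",
    ("security", "cookie_samesite"): "none",
    ("auth.anonymous", "enabled"): "true",
}


def check_grafana_ini(text):
    """Return (section, option, found_value) for every setting that differs."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # grafana.ini may start with options outside any section, which
    # configparser rejects, so park them under a placeholder section.
    parser.read_string("[__top__]\n" + text)
    problems = []
    for (section, option), wanted in EXPECTED.items():
        found = parser.get(section, option, fallback=None)
        if found is None or found.strip().lower() != wanted:
            problems.append((section, option, found))
    return problems
```

An empty list means all four settings match. Note that this only looks at the file, not at environment-variable overrides that a Grafana deployment may also use.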

When I go to the /api/grafana_health endpoint, it returns a 200 response and says “Grafana running” with the correct grafanaHost. However, for the dashboardUids section, I don’t see my dashboard, maybe that’s my issue? Is there an environment variable or something I can set to change this? Or maybe I’m missing a step somewhere?

Also ran grep -r 'grafana' /tmp/ray/session_latest/logs/, but that didn’t provide anything useful either.

Thank you!

Maybe take a look at this PR, which tries to improve the docs on this: polish observability (o11y) docs by scottsun94 · Pull Request #39069 · ray-project/ray

The RAY_GRAFANA_IFRAME_HOST value doesn’t seem right. It should be the IP where your browser (and the underlying machine) can reach the Grafana server.
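A rough way to sanity-check these values before starting Ray, as a sketch only: the helper `check_host` is hypothetical, and the rules it applies (protocol present, no trailing slash, no dashboard path) are just the ones discussed in this thread, not an official validation.

```python
from urllib.parse import urlsplit


def check_host(value):
    """Return a list of problems with a host setting; empty means it looks fine."""
    problems = []
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https"):
        problems.append("missing http:// or https:// prefix")
    if not parts.netloc:
        problems.append("missing host name or IP")
    if value.endswith("/"):
        problems.append("trailing slash")
    # /d/<uid>/<name> is a Grafana dashboard route, not a server address.
    if "/d/" in parts.path:
        problems.append("points at a dashboard URL rather than the Grafana server")
    return problems
```

For example, `check_host("my-grafana-host/d/rayDefaultDashboard/dashboard-name")` flags the missing protocol and the dashboard path, while a plain `http://<ip>:<port>` value passes.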

So that would be IP of RAY_GRAFANA_HOST right? It didn’t seem to like that either :slightly_frowning_face:

Yeah, as long as your browser can reach it. Can you try opening the RAY_GRAFANA_HOST address directly in your browser?

Yup that seems to work. Going onto the Docker container Ray is running on and curling the Grafana server works as well.

So both the grafana and prometheus healthchecks passed for you?
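If it helps, here is a small sketch for checking both health endpoints from a script. Assumptions: the Prometheus check lives at /api/prometheus_health next to the /api/grafana_health endpoint mentioned earlier, and the response is JSON containing a "msg" field and a "dashboardUids" field somewhere in the body. The helper names are made up, and the exact nesting is not assumed, which is why the lookup searches recursively.

```python
import json
from urllib.request import urlopen


def fetch_health(dashboard_url, name):
    """GET <dashboard_url>/api/<name>_health and return the parsed JSON body.

    urlopen raises an HTTPError for non-2xx responses.
    """
    with urlopen(f"{dashboard_url}/api/{name}_health", timeout=5) as response:
        return json.load(response)


def find_key(payload, key):
    """Depth-first search for `key` in nested dicts; None if it is absent."""
    if isinstance(payload, dict):
        if key in payload:
            return payload[key]
        for value in payload.values():
            hit = find_key(value, key)
            if hit is not None:
                return hit
    return None


def report(dashboard_url):
    """Print the message and dashboard UIDs reported by each health check."""
    for name in ("grafana", "prometheus"):
        body = fetch_health(dashboard_url, name)
        print(name, find_key(body, "msg"), find_key(body, "dashboardUids"))
```

Calling `report` with the Ray Dashboard address (for example `http://127.0.0.1:8265` on a default local setup) shows at a glance which of the two checks is failing.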


Oh that might be it! It was failing cause I had a self-signed certificate, but I fixed that, and now it’s failing because I have authentication turned on for Prometheus. Is there a way to pass credentials or some kind of token to Ray?

Hmm I’m not sure. Can you try turning off the auth for Prometheus first and see if it works?

Another thing to note: if you use an existing Grafana, you may need to import the Ray-provided dashboard JSON files into it first. After you start the Ray cluster, you can find them at /tmp/ray/session_latest/metrics/grafana/dashboards. Copy the JSON files over and import them into Grafana as dashboards.
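To see which dashboards Ray generated before importing them, something like the following sketch can list each file with its `uid` and `title` (both are standard top-level fields in Grafana dashboard JSON). The function name `list_dashboards` is hypothetical; the default directory is the one given above.

```python
import json
from pathlib import Path

# Directory mentioned above; it exists only after a Ray cluster has started.
DEFAULT_DIR = "/tmp/ray/session_latest/metrics/grafana/dashboards"


def list_dashboards(directory=DEFAULT_DIR):
    """Map each dashboard JSON file name to its (uid, title)."""
    found = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            dashboard = json.load(handle)
        found[path.name] = (dashboard.get("uid"), dashboard.get("title"))
    return found
```

Comparing these UIDs with the dashboardUids reported by /api/grafana_health is one way to confirm that the imported dashboards are the ones Ray expects.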

Turning off authentication for Prometheus did the trick. It would be nice to have authentication turned on but at least the visualizations are being embedded, thanks!

Great to know that.

Any idea how we can support it? How could the Ray head node automatically authenticate to Prometheus in this case? With a credentials file, an env var, or something else?

cc: @aguo

  • Enabled anonymous access (is there a way to pass credentials from Ray to Grafana?)

Can you please tell me if you have found a solution without enabling anonymous mode?

Unfortunately I didn’t :confused:

@Huaiwei_Sun

I have an existing Grafana which is hosted in a separate AKS cluster. Is there some way to provide the authentication details that my existing Grafana instance requires?

If not, what are my alternatives if I want to use the same Grafana instance?
Currently I get the “refused to connect” error in my Ray dashboard.

Unfortunately I don’t have a good answer to this. I don’t think it currently supports this kind of authentication. Will let @Sam_Chan @aguo confirm.

Unfortunately, I don’t think Grafana supports embedding with authentication: How to embed Grafana dashboards into web applications | Grafana Labs

They list out a few reasons there but the primary one is that having to separately authenticate can be a poor experience.

For querying Prometheus with authentication, I think passing in an API key via env vars would make sense. We’re happy to accept PRs adding this functionality, but it may not be that useful without Grafana supporting authentication.
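To make the idea concrete, here is a sketch of what such a change could look like. This is not existing Ray behavior: the env var name `EXAMPLE_PROMETHEUS_TOKEN` and the function `build_prometheus_query` are hypothetical, and it assumes the Prometheus endpoint sits behind something that accepts a bearer token. The `/api/v1/query` path is the standard Prometheus HTTP query API.

```python
import os
from urllib.parse import urlencode
from urllib.request import Request


def build_prometheus_query(host, promql, token=None):
    """Build a GET request for an instant query, adding a bearer token if given."""
    url = f"{host}/api/v1/query?{urlencode({'query': promql})}"
    request = Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def request_from_env(promql):
    """Hypothetical wiring: read the host and an optional token from env vars."""
    host = os.environ.get("RAY_PROMETHEUS_HOST", "http://localhost:9090")
    token = os.environ.get("EXAMPLE_PROMETHEUS_TOKEN")  # made-up name
    return build_prometheus_query(host, promql, token)
```

Keeping the token optional means unauthenticated setups would keep working unchanged, and the request can be passed to `urllib.request.urlopen` as usual.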