Ray Dashboard Grafana Embedding Error

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround (e.g., using standalone Grafana).
2. Environment:

  • Ray version: 2.46.0
  • Python version: 3.10.12
  • OS: Ubuntu 22.04
  • Other libs/tools (if relevant): Grafana 12.0.0

3. Repro steps / sample code:

  1. Set up Ray with Grafana integration, intending to use the official Ray-provided monitoring dashboards.
  2. The Ray Dashboard is configured to embed the official Grafana dashboard, which appears to be identified by dashboardUID=rayDefaultDashboard.
  3. When this rayDefaultDashboard is inspected directly within Grafana:
    a. It utilizes a dashboard variable named datasource, which is defined with a constant value of “prometheus”. This “prometheus” refers to a configured Grafana data source pointing to our Prometheus instance.
    b. Panels within this dashboard are configured to use ${datasource} as their data source.
    c. The queries within these panels are standard PromQL queries (e.g., sum(increase(ray_tasks{...}[...])) by (State)), appropriate for a Prometheus data source.
  4. Access the Ray Dashboard UI section where these Grafana panels from the official rayDefaultDashboard are embedded.
  5. Open browser developer tools and inspect the network requests made by the embedded Grafana panels.
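For step 5, a rough way to summarize what the embedded panels request is to copy the request URLs out of the browser's network tab and count them per Grafana endpoint. This is only a sketch: the function names and the sample URLs (including the host, port, and query values) are made up for illustration and are not taken from an actual capture.

```python
from collections import Counter
from urllib.parse import urlparse


def classify_grafana_request(url: str) -> str:
    """Label a request URL by the Grafana endpoint it targets."""
    path = urlparse(url).path  # query string is ignored on purpose
    if path.endswith("/api/ds/query"):
        return "ds_query"
    if path.endswith("/api/annotations"):
        return "annotations"
    return "other"


def summarize_requests(urls):
    """Count how many requests hit each endpoint category."""
    return Counter(classify_grafana_request(u) for u in urls)


# Hypothetical URLs standing in for entries copied from the network tab.
sample_urls = [
    "http://localhost:3000/api/annotations?from=0&to=1&limit=100"
    "&matchAny=false&dashboardUID=rayDefaultDashboard",
    "http://localhost:3000/api/ds/query",
    "http://localhost:3000/some/other/resource",
]

summary = summarize_requests(sample_urls)
print(dict(summary))
```

If the count for ds_query is zero while annotations is non-zero, that matches the behavior described in this report.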

4. What happened vs. what you expected:

  • Expected:

    • The official Ray-provided Grafana dashboard (rayDefaultDashboard), when embedded in the Ray Dashboard, should use its configured data source (resolved from the ${datasource} variable to “prometheus”).
    • Network requests for panel data should be made to Grafana’s /api/ds/query endpoint, with PromQL queries targeting the Prometheus data source, as configured within the dashboard’s panels.
    • Panels should display metric data correctly, reflecting the Ray cluster’s state.
  • Actual:

    • All embedded Grafana panels from the official rayDefaultDashboard in the Ray Dashboard consistently make network requests to Grafana’s /api/annotations?from=...&to=...&limit=100&matchAny=false&dashboardUID=rayDefaultDashboard endpoint.
    • As a result, these official monitoring panels show “no data” or do not display the expected metrics.
    • Accessing the same rayDefaultDashboard directly in Grafana (not embedded in Ray) works as expected: panels use the “prometheus” data source, PromQL queries are issued via /api/ds/query, and data is displayed correctly.
    • grafana.ini settings for embedding and cookie handling appear correct.
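To make the grafana.ini check concrete, here is a minimal sketch that reads the settings usually relevant to embedding. The sample file content is an assumption, not the actual configuration from this setup. The keys checked (allow_embedding and cookie_samesite under [security], enabled under [auth.anonymous]) are standard Grafana options, and the fallbacks used here are intended to mirror Grafana's defaults.

```python
import configparser

# Assumed example content; replace with the text of the real grafana.ini.
SAMPLE_INI = """
[security]
allow_embedding = true
cookie_samesite = lax

[auth.anonymous]
enabled = true
org_role = Viewer
"""


def embedding_report(ini_text: str) -> dict:
    """Return the embedding-related settings found in a grafana.ini string."""
    # interpolation=None so literal '%' characters in values are not expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {
        "allow_embedding": parser.getboolean(
            "security", "allow_embedding", fallback=False
        ),
        "cookie_samesite": parser.get(
            "security", "cookie_samesite", fallback="lax"
        ),
        "anonymous_enabled": parser.getboolean(
            "auth.anonymous", "enabled", fallback=False
        ),
    }


report = embedding_report(SAMPLE_INI)
print(report)
```

With allow_embedding left at false, Grafana refuses to render inside an iframe at all, so a panel that loads but shows "no data" points more toward authentication, cookies, or the address the browser uses to reach Grafana.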

The core issue is that Ray’s official monitoring dashboard (rayDefaultDashboard), which seems correctly configured for PromQL queries against Prometheus when viewed directly in Grafana, is not functioning as expected when embedded in the Ray Dashboard. Instead of utilizing Grafana’s standard data querying mechanism (/api/ds/query), the Ray Dashboard appears to force all data requests for this official dashboard through Grafana’s /api/annotations endpoint. This behavior is unexpected for an official, out-of-the-box monitoring asset and suggests a potential bug in Ray’s dashboard embedding logic or a compatibility issue with the current Grafana version.

I wasn’t able to repro this on Grafana 11 or Grafana 12.

I do see calls to /api/annotations, but I also see calls to /api/query that fetch the actual data.

If this fails only when embedded, it could be related to networking settings or browser security settings.

Are you running Grafana locally on the same machine as the Ray cluster? Are you visiting Grafana from that same machine as well?
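One quick way to check the machine question: Ray reads RAY_GRAFANA_HOST for the head node's own access to Grafana and RAY_GRAFANA_IFRAME_HOST for the address the browser uses to load the embedded panels, falling back to RAY_GRAFANA_HOST when the latter is unset. The sketch below flags the case where the browser-facing address is a loopback address, which only works when the browser runs on the same machine as Grafana. The helper name and the example addresses are hypothetical.

```python
import os
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def iframe_host_is_loopback(env) -> bool:
    """True if the address the browser would use for Grafana is loopback."""
    # Browser-facing address; falls back to RAY_GRAFANA_HOST when unset.
    url = env.get("RAY_GRAFANA_IFRAME_HOST") or env.get("RAY_GRAFANA_HOST", "")
    # hostname is None when the value is empty or lacks an http:// scheme.
    host = urlparse(url).hostname
    return host in LOOPBACK_HOSTS


# Check the current process environment (run this where Ray was started).
print(iframe_host_is_loopback(os.environ))

# Hypothetical values for illustration.
same_machine = {"RAY_GRAFANA_HOST": "http://localhost:3000"}
remote_browser = {
    "RAY_GRAFANA_HOST": "http://localhost:3000",
    "RAY_GRAFANA_IFRAME_HOST": "http://10.0.0.5:3000",
}
print(iframe_host_is_loopback(same_machine))
print(iframe_host_is_loopback(remote_browser))
```

If this returns True and the Ray Dashboard is being viewed from a different machine, the iframes would be pointing at the viewer's own localhost rather than at the Grafana server.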