Thanks for your response, @aguo. However, I’m seeing several metrics that do not have this label. Some examples from a test Ray cluster are pasted below.
Yes, applying additional labels is a good idea. I’m exploring that with Prometheus’s metric relabeling feature.
> ray_heartbeat_report_ms_bucket{Component="raylet",JobId="",NodeAddress="10.86.49.51",Version="3.0.0.dev0",WorkerId="",le="+Inf"} 389.0
>
> ray_heartbeat_report_ms_count{Component="raylet",JobId="",NodeAddress="10.86.49.51",Version="3.0.0.dev0",WorkerId=""} 389.0
>
> ray_heartbeat_report_ms_sum{Component="raylet",JobId="",NodeAddress="10.86.49.51",Version="3.0.0.dev0",WorkerId=""} 388084.00000000006
>
> # HELP ray_internal_num_spilled_tasks The cumulative number of lease requeusts that this raylet has spilled to other raylets.
>
> # TYPE ray_internal_num_spilled_tasks gauge
>
> ray_internal_num_spilled_tasks{Component="raylet",JobId="",NodeAddress="10.86.49.51",Version="3.0.0.dev0",WorkerId=""} 0.0
>
> # HELP ray_internal_num_processes_started The total number of worker processes the worker pool has created.
>
> # TYPE ray_internal_num_processes_started gauge
>
> ray_internal_num_processes_started{Component="raylet",JobId="",NodeAddress="10.86.49.51",Version="3.0.0.dev0",WorkerId=""} 1.0
>
> # HELP ray_internal_num_infeasible_scheduling_classes The number of unique scheduling classes that are infeasible.
>
> # TYPE ray_internal_num_infeasible_scheduling_classes gauge
>
> ray_internal_num_infeasible_scheduling_classes{Component="raylet",JobId="",NodeAddress="10.86.49.51",Version="3.0.0.dev0",WorkerId=""} 0.0
>
> # HELP ray_pull_manager_requests Number of pull requests broken per type {Queued, Active, Pinned}.
>
> # TYPE ray_pull_manager_requests gauge
>
> ray_pull_manager_requests{Component="raylet",JobId="",NodeAddress="10.86.49.51",Type="Queued",Version="3.0.0.dev0",WorkerId=""} 0.0
>
> # HELP ray_pull_manager_requests Number of pull requests broken per type {Queued, Active, Pinned}.
>
> # TYPE ray_pull_manager_requests gauge
>
> ray_pull_manager_requests{Component="raylet",JobId="",NodeAddress="10.86.49.51",Type="Active",Version="3.0.0.dev0",WorkerId=""} 0.0
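
To check a full scrape for series that lack a particular label, a small script over the text output can help. The sketch below is only an illustration, not part of Ray or Prometheus: `series_missing_label` is a hypothetical helper name, and the regexes handle the simple `name{label="value",...} value` lines shown above rather than every corner of the exposition format.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair, e.g. NodeAddress="10.86.49.51" (escaped quotes allowed).
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def series_missing_label(exposition_text, label):
    """Return the metric name of every sample line that lacks `label`.

    Hypothetical helper for illustration; comment lines (# HELP, # TYPE)
    and lines that do not look like samples are skipped.
    """
    missing = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, _value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        if label not in labels:
            missing.append(name)
    return missing
```

For example, running it with `label="Type"` over the lines pasted above would list every series except the two `ray_pull_manager_requests` ones, since only those carry a `Type` label. Pass in whichever label name you are checking for.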