Are the statistics shown on the Ray dashboard (CPU utilization, memory, network) available programmatically through some API? I want to track my cluster's metrics and view them as line plots. I could write my own data collector using psutil,
but since the information is already shown in the Ray dashboard, I wonder whether I can get it from there instead.
Maybe @eoakes can provide some guidance here?
Ray exports metrics on each node, at a port you specify, in Prometheus format. You can scrape these metrics and store them; the easiest way is to collect them with Prometheus and visualize them with Grafana. Docs here: Ray Monitoring — Ray v1.1.0
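If you would rather pull the numbers into your own script than run a full Prometheus server, here is a minimal sketch that fetches a metrics endpoint and parses the Prometheus text format using only the standard library. The helper names (`parse_prometheus_text`, `fetch_metrics`) and the URL in the usage note are hypothetical, not part of Ray's API; substitute the host and metrics port your nodes are actually configured with. The parser only handles simple `name{labels} value` lines, so treat it as a starting point rather than a complete implementation of the format.

```python
import re
import urllib.request

# One sample line: name, optional {labels}, value, optional integer timestamp.
_SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# One label pair inside the braces: key="value" (value may contain escapes).
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_prometheus_text(text):
    """Return a list of (metric_name, labels_dict, float_value) tuples."""
    samples = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        name, label_blob, value = match.groups()
        labels = dict(_LABEL_RE.findall(label_blob or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url, timeout=5.0):
    """Download a metrics page and parse it. The URL is supplied by the caller."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return parse_prometheus_text(body)
```

Calling something like `fetch_metrics("http://<node-ip>:<metrics-port>/metrics")` on a timer and appending the results with a timestamp would give you a time series you can feed to any plotting library. For anything long-running, Prometheus plus Grafana will still be less work than maintaining your own collector.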