Network I/O monitoring at the Ray job/task level

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I would like to know whether Ray provides any metrics for monitoring network I/O at the job or task level. I am using Ray tasks to process video streams, and such metrics would be very useful.

If there are no such metrics, it would be great if you could point me to alternatives that I could look into to achieve a similar level of monitoring. Thank you in advance!

Check out this: Collecting and monitoring metrics — Ray 2.9.3

I don’t think there are out-of-the-box metrics for your use case. You can report custom application metrics yourself.
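
As a rough illustration of what reporting your own metric could look like, here is a minimal sketch that counts the bytes a task reads from a stream and reports them through a `ray.util.metrics.Counter`. The metric name `video_bytes_received`, the `task_name` tag, and the helpers `metered` and `make_ray_reporter` are all made up for this example, and it assumes your task receives the stream as an iterable of `bytes` chunks. It measures application-level payload bytes only, not actual traffic on the wire.

```python
from typing import Callable, Iterable, Iterator


def metered(chunks: Iterable[bytes], report: Callable[[int], None]) -> Iterator[bytes]:
    """Yield chunks unchanged while reporting the size of each non-empty chunk."""
    for chunk in chunks:
        if chunk:  # Counter.inc() rejects non-positive values, so skip empty chunks
            report(len(chunk))
        yield chunk


def make_ray_reporter(task_name: str) -> Callable[[int], None]:
    """Build a reporter backed by a Ray Counter (requires Ray to be installed)."""
    from ray.util.metrics import Counter  # imported lazily so `metered` works without Ray

    counter = Counter(
        "video_bytes_received",  # hypothetical metric name
        description="Bytes read from video streams by a task.",
        tag_keys=("task_name",),  # hypothetical tag
    )
    counter.set_default_tags({"task_name": task_name})
    return lambda num_bytes: counter.inc(num_bytes)


# Usage sketch inside a Ray task (read_chunks is a hypothetical stream reader):
#
# @ray.remote
# def process_stream(url):
#     report = make_ray_reporter("process_stream")
#     for chunk in metered(read_chunks(url), report):
#         ...  # decode / process the chunk
```

Keeping the counting logic in `metered` separate from the Ray counter makes it easy to test without a cluster, and a second counter for bytes sent can be added the same way.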

Thank you for the reply.

I did read the link that you posted, and I have the Ray dashboard set up with my existing Grafana and Prometheus. Beyond the built-in metrics, though, I was wondering whether there is any information on how to add metric panels to the Ray dashboard (for example, for the custom application metrics that you suggested). I couldn’t find such information on the page you linked or elsewhere in the Ray docs.

Thank you again!

Best,
Claire

Do you want to customize the UI/embedded Grafana graphs in the Ray Dashboard?
If so, that’s not supported out of the box. You would need to modify the Ray code (probably here: ray/dashboard/client/src/pages/metrics/Metrics.tsx at master · ray-project/ray · GitHub) and build the UI yourself.

Another option is to not rely on the Ray dashboard and to use Grafana with your own custom dashboards instead.
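
For the Grafana route, a panel is essentially a PromQL query against your Prometheus data source. Below is a small sketch that builds such a query for a per-task byte rate and the matching Prometheus HTTP API URL, so you can check the query before pasting it into a panel. The metric name `ray_video_bytes_received_total` and the `task_name` label are assumptions for a hypothetical custom counter; check the names your metrics endpoint actually exports, since the prefix and suffix may differ.

```python
from urllib.parse import urlencode


def bytes_per_second_query(metric: str, window: str = "1m") -> str:
    """PromQL for the per-task rate of a counter, averaged over `window`."""
    return f"sum by (task_name) (rate({metric}[{window}]))"


def prometheus_query_url(base_url: str, promql: str) -> str:
    """URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Assumed metric name for a hypothetical custom counter; adjust to what you export.
promql = bytes_per_second_query("ray_video_bytes_received_total")
# Assumes Prometheus is reachable at its default local address.
url = prometheus_query_url("http://localhost:9090", promql)
print(promql)
print(url)
```

The same PromQL string can be used directly as the query of a Grafana time-series panel.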

Thank you for the reply. I will look into adding custom metrics and using Grafana for monitoring :slight_smile: