When I use “ray job submit” to submit a job to a Ray cluster, I use Prometheus to collect the default metrics.
The cluster collects the CPU and memory usage of nodes and components while the job is running.
However, I cannot select the CPU and memory usage of a specific task or job.
Does Ray support collecting this usage information per job at runtime?
In other words, I want to collect the CPU and memory usage of a single job. How can I achieve this?
If not, can I calculate an approximate CPU/memory usage for a job from other collected metrics (e.g., actor or component metrics)?
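
To make the last question concrete, here is a rough sketch of the kind of approximation I have in mind: take per-process samples and sum them per job. The sample layout and the field names (`job_id`, `cpu_percent`, `memory_mb`) are hypothetical placeholders that I made up for illustration, not Ray's actual metric schema, and I do not know whether the exported metrics can be mapped to a job like this.

```python
from collections import defaultdict


def aggregate_by_job(samples):
    """Sum CPU percent and memory (MB) per job across per-process samples.

    Each sample is a dict with hypothetical keys: job_id, cpu_percent, memory_mb.
    """
    totals = defaultdict(lambda: {"cpu_percent": 0.0, "memory_mb": 0.0})
    for sample in samples:
        job = totals[sample["job_id"]]
        job["cpu_percent"] += sample["cpu_percent"]
        job["memory_mb"] += sample["memory_mb"]
    return dict(totals)


# Made-up samples, as if scraped at one point in time from worker processes.
samples = [
    {"job_id": "job_a", "cpu_percent": 50.0, "memory_mb": 128.0},
    {"job_id": "job_a", "cpu_percent": 25.0, "memory_mb": 256.0},
    {"job_id": "job_b", "cpu_percent": 10.0, "memory_mb": 64.0},
]

usage = aggregate_by_job(samples)
for job_id, total in usage.items():
    print(job_id, total["cpu_percent"], total["memory_mb"])
```

If something like this is possible, which of the collected metrics carry enough information to attribute a worker process to a job?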