Environment:
- Ray version: 2.48.0
- Python version: 3.12
- OS: Linux
- Cloud/Infrastructure: GCP
- Other libs/tools (if relevant):

I am working on distributed training on Ray using TorchTrainer and TensorflowTrainer. I am trying to get low-level metrics and insights into which tasks are launched and how much time each of them takes to execute.

I am able to get the Ray-level task breakdown through the timeline view and by listing tasks through the API. Currently I am not using Ray Data for data preprocessing. Attaching a screenshot of the timeline view of the tasks.

For PyTorch/TensorFlow training, I want a task-level view inside the TensorflowTrainer/TorchTrainer task itself, for example the data loading time and the time taken for each epoch and each batch.

Does Ray capture information about the work that runs inside the TensorFlow/PyTorch training task? If not, is there any way to get these lower-level metrics?
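
To make the request concrete, here is a minimal sketch of the breakdown I am after, written as manual instrumentation that could sit inside the per-worker training loop. It is plain Python with no Ray calls. `PhaseTimer`, `run_epoch`, and the phase names are hypothetical names made up for illustration, not part of Ray, PyTorch, or TensorFlow.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

_DONE = object()  # sentinel marking an exhausted batch iterator


class PhaseTimer:
    """Hypothetical helper: accumulates wall-clock time per named phase."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the timed block raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def summary(self):
        return {
            name: {"total_s": self.totals[name], "count": self.counts[name]}
            for name in self.totals
        }


def run_epoch(batches, train_step, timer):
    """Time batch fetching and the training step separately for one epoch."""
    iterator = iter(batches)
    with timer.phase("epoch"):
        while True:
            # Fetching the next batch is what "data loading time" means here.
            # The final fetch that hits the end of the iterator is counted too.
            with timer.phase("data_loading"):
                batch = next(iterator, _DONE)
            if batch is _DONE:
                break
            with timer.phase("train_step"):
                train_step(batch)


if __name__ == "__main__":
    timer = PhaseTimer()
    # Stand-in data and step function; a real loop would use a DataLoader
    # or tf.data pipeline and an actual forward/backward pass.
    run_epoch(range(5), lambda batch: time.sleep(0.01), timer)
    for name, stats in timer.summary().items():
        print(f"{name}: {stats['total_s']:.4f}s over {stats['count']} call(s)")
```

I assume the resulting totals could be sent out of each worker as ordinary metrics through `ray.train.report`, but I do not know whether that is the recommended approach or whether Ray already exposes this level of detail somewhere.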