Use nsys to profile a Ray program

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am trying to use the nsys (Nsight Systems) profiling feature with Ray 2.9.0. Here is my code:

import torch
import ray

ray.init()


@ray.remote(
    num_gpus=1,
    runtime_env={
        "nsight": {
            "t": "cuda,cudnn,cublas,nvtx",
            "o": "'worker_process_%p'",
            "cudabacktrace": "all",
        }
    },
)
class RayActor:
    def run(self):
        a = torch.randint(0, 2, [128, 2, 2048, 2048]).cuda()
        b = torch.randint(0, 2, [128, 2, 2048, 2048]).cuda()
        for i in range(10):
            c = a * b
        print("Result on GPU:", c)


ray_actor = RayActor.remote()
# The actor or task process runs with: "nsys profile [default options] ..."
ray.get(ray_actor.run.remote())

But I can't capture any useful information. nsys generates a .qdrep file like the following:

Can anyone help me?