ray::IDLE takes a lot of memory and cluster utilization drops to half (Ray Data)

Hi team. I am using Ray Data with two operators: one is a CPU task and the other is a GPU task.

Below is the Grafana chart. After the job had been running for several hours, the memory used by ray::IDLE suddenly rose to a very high level, and cluster utilization dropped to about half.

I cannot find any useful information in the logs or the dashboard. How should I debug this issue?
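In case it helps to narrow this down, here is a minimal sketch of how the ray::IDLE memory could be tracked on a node outside of Grafana. It assumes a Linux node where `ps -eo pid,rss,args` reports RSS in KiB and where idle workers appear with a command starting with `ray::IDLE`. The helper names `sum_idle_rss_kib` and `snapshot` are hypothetical and not part of Ray.

```python
import subprocess
from typing import Tuple


def sum_idle_rss_kib(ps_output: str, name: str = "ray::IDLE") -> Tuple[int, int]:
    """Return (process_count, total_rss_kib) for commands starting with `name`.

    Expects text in the layout of `ps -eo pid,rss,args` (assumption).
    """
    count = 0
    total = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # pid, rss, full command line
        if len(parts) < 3 or not parts[1].isdigit():
            continue  # skip the header row and malformed lines
        if parts[2].startswith(name):
            count += 1
            total += int(parts[1])
    return count, total


def snapshot() -> Tuple[int, int]:
    """Take one reading from the local node by shelling out to `ps`."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,args"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return sum_idle_rss_kib(out)
```

Calling `snapshot()` periodically on each node and logging the result with a timestamp might show whether the growth is gradual or happens at one moment, and whether it comes from a few large ray::IDLE processes or from many small ones.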

Thanks very much.