How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I have a strange issue that looks like a memory leak, but I am not sure what the source is. During training, memory usage spikes on some of the GPUs, as can be seen in the following picture, and this eventually leads to an OOM error. Interestingly, all the GPUs start with even memory utilization, but over time it increases, and some GPUs end up using more than others. Any idea how I can investigate this, find the source, and potentially resolve it?
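
To narrow down which GPUs are growing and when, something like the following could be used to log per-GPU memory during training. This is only a minimal sketch: it assumes `nvidia-smi` is available on the PATH of each node and reports numeric memory values, and the helper names (`parse_gpu_memory`, `sample_gpu_memory`, `memory_growth`) are hypothetical, not part of Ray.

```python
import subprocess

# Assumption: nvidia-smi is installed and on the PATH of the node being sampled.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV rows ("index, used, total") into {gpu_index: used_mib}."""
    used = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, mem_used, _mem_total = [field.strip() for field in line.split(",")]
        used[int(index)] = int(mem_used)
    return used


def sample_gpu_memory():
    """Run nvidia-smi once and return the used memory per GPU in MiB."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


def memory_growth(first, last):
    """Return the change in used memory (MiB) per GPU between two samples."""
    return {gpu: last[gpu] - first[gpu] for gpu in first if gpu in last}
```

Calling `sample_gpu_memory()` every few hundred training steps and comparing against the first sample with `memory_growth` would show whether the increase is gradual (more like a leak) or happens in sudden jumps, and which GPU indices are affected.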