Ray head memory leak in 1.13?

Hi Ray team, I’m running into what looks like a memory leak on the Ray head node in the latest Ray 1.13. It’s very similar to this discussion: Memory leak in ray head

  1. I’m using Ray 1.13, deployed on a k8s cluster.

  2. The head node’s memory usage keeps increasing slowly, and after about 4 days it runs out of memory (the limit is 3GB in my setup). Interestingly, the grid is idle for most of this period and no tasks are sent to it. Here’s the dashboard screenshot from shortly before the OOM:

  3. Here’s the top output from the head node:

  4. I also tried ray memory, but I can’t see any red flags in its output.


Any help would be appreciated, and I’m happy to provide more information.

Thanks,
-BS


Could anyone shed any light on this? It’s a blocker on our production rollout.

Thanks a lot

-BS

From the top output, the python process (the third one) seems to be using a lot of memory. Do you know what command it is running?
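If it helps, here is a rough sketch of one way to check which command each memory-heavy process is running. It assumes a Linux head node with /proc mounted, and the function names (read_rss_kb, top_memory_processes) are only illustrative; they are not part of Ray.

```python
from pathlib import Path


def read_rss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def top_memory_processes(proc_root="/proc", limit=5):
    """Return up to `limit` tuples (rss_kb, pid, cmdline), largest RSS first."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # assumption: no procfs here, e.g. a non-Linux host
    rows = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            rss = read_rss_kb((entry / "status").read_text())
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited or is not readable
        if rss is None:
            continue
        # Arguments in cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        rows.append((rss, int(entry.name), cmdline))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss_kb, pid, cmdline in top_memory_processes():
        print(f"{rss_kb / 1024:8.1f} MiB  pid={pid}  {cmdline}")
```

Running this inside the head pod prints the largest processes by resident memory together with their full command lines, which should show whether the growing python process is the dashboard or something else.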

It’s the dashboard process as far as I remember.

@blshao84 I see, most likely you are hitting the dashboard memory leak problem. We have a P0 issue for that, which should be resolved in the coming weeks.


Thanks, good to know it will be fixed soon.