I use Ray inside a separate process, so I assumed it would clean up /dev/shm on shutdown. However, after 5 iterations it runs out of memory and crashes.
Here is a high-level version of the code:
for i in range(10):
    proc = Process(
        target=func,
        args=(a, b, ...),
    )
    proc.start()
    proc.join()
Inside func(), both ray.init() and ray.shutdown() are called.
After 5 iterations of the above loop it runs out of memory, refuses to spill because /tmp/ray is 95% full (according to the message), and crashes. There is plenty of RAM, since the process returns its memory on join(). So it must be that /dev/shm fills up and is not released. Is that a correct conclusion?
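One way to test that conclusion is to log how full the relevant filesystems are after every proc.join(). The sketch below is only a diagnostic under assumptions: percent_used and report are hypothetical helper names, and the two paths are the ones mentioned in this post. Note that shutil.disk_usage reports usage for the whole filesystem that contains the path, not for the directory alone.

```python
import os
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a zero size; avoid dividing by zero.
        return 0.0
    return 100.0 * usage.used / usage.total


def report(paths=("/dev/shm", "/tmp/ray")):
    """Map each existing path to the fill level of its filesystem."""
    return {p: round(percent_used(p), 1) for p in paths if os.path.exists(p)}


if __name__ == "__main__":
    # Call report() after each proc.join() to see which mount actually grows.
    print(report())
```

If /dev/shm stays flat across iterations while the filesystem under /tmp/ray keeps growing, the shared-memory explanation would not hold.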
Is the solution to increase the memory given to init()?
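For concreteness, "increasing memory" would look roughly like the sketch below, which passes object_store_memory (a size in bytes) to ray.init(). The helper names and the 2 GiB figure are placeholders, not values from my actual code, and whether this addresses the crash at all is exactly the open question.

```python
def gib_to_bytes(gib):
    """Convert a size in GiB to bytes."""
    return int(gib * 1024**3)


def init_ray_with_store(gib):
    """Start Ray with an explicit object store size (hypothetical helper)."""
    import ray  # imported lazily so gib_to_bytes works without Ray installed

    ray.init(object_store_memory=gib_to_bytes(gib))


if __name__ == "__main__":
    # 2 GiB is an arbitrary example value.
    print(gib_to_bytes(2))
```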
Also, I don't understand why /tmp/ray gets 95% full on the 5th iteration. This happens even if I explicitly delete /tmp/ray recursively using shutil after each proc.join().
There is nothing wrong with the 5th chunk, because if it is processed first it goes through fine.
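The deletion step is roughly the following sketch (remove_tree is a hypothetical helper name, not the exact code I run). It uses ignore_errors=True, which hides failures, so the helper checks afterwards that the directory is really gone. On Linux, files that are still held open by a lingering process keep occupying space even after they are unlinked, so a successful delete does not by itself prove the space was freed.

```python
import os
import shutil


def remove_tree(path):
    """Recursively delete `path`; return True if it no longer exists."""
    shutil.rmtree(path, ignore_errors=True)
    return not os.path.exists(path)


if __name__ == "__main__":
    # Intended to run after each proc.join() in the loop.
    print(remove_tree("/tmp/ray"))
```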
Thank you!