Ray trainer CPU memory blowup

Hi there

I am wrapping previous Lightning code into a Ray dataloader and Ray Trainer. The problem is that whenever the trainer starts online validation (I used val_check_interval in the Lightning trainer), memory and the object store start to blow up. I have attached a Grafana screenshot for reference; online validation runs from 20:35 to 20:45. More on the setup: I am using an iteration-based trainer, and the training and validation dataloaders use the same configuration (same concurrency, block_size, etc.).

What is the problem here? Is the train dataloader continuously pouring data into the object store even while validation is running?

That is odd.
A simple reproduction would definitely help here, if possible.

(Ideally with code and cluster configuration, so that one could run it.)