Here is the RAM usage outside of Docker (2 workers)…
The line is fairly flat, which suggests to me that training inside a Docker container may be what causes the memory leak.
Here is the other thread that discussed Docker containers and Linux cgroups in connection with a memory leak: Help debugging a memory leak in rllib
I would also like to note that I don’t think there was a memory leak when using a single worker (and no GPU) inside of a Docker container. I will go back and check.
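One way to check whether the growth inside the container is a real leak or page cache being charged to the container's cgroup is to log the cgroup's own memory accounting while training runs. Below is a rough sketch of that idea, not something taken from the linked thread. It assumes the standard cgroup mount points (`/sys/fs/cgroup` for v2, `/sys/fs/cgroup/memory` for v1), which may differ on some setups. The helper names (`parse_memory_stat`, `read_cgroup_memory`, `summarize`) are made up for illustration.

```python
from pathlib import Path

# Assumed standard mount points; adjust if the container uses a different layout.
CGROUP_V2 = Path("/sys/fs/cgroup")
CGROUP_V1 = Path("/sys/fs/cgroup/memory")


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def read_cgroup_memory():
    """Return (usage_bytes, stats) for the current cgroup, or None if unavailable."""
    candidates = (
        (CGROUP_V2 / "memory.current", CGROUP_V2 / "memory.stat"),
        (CGROUP_V1 / "memory.usage_in_bytes", CGROUP_V1 / "memory.stat"),
    )
    for usage_file, stat_file in candidates:
        try:
            usage = int(usage_file.read_text().strip())
            stats = parse_memory_stat(stat_file.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable; try the next layout
        return usage, stats
    return None


def summarize(usage, stats):
    """Split total usage into anonymous memory and page cache, in MiB."""
    # cgroup v2 reports 'anon' and 'file'; cgroup v1 reports 'rss' and 'cache'.
    anon = stats.get("anon", stats.get("rss", 0))
    cache = stats.get("file", stats.get("cache", 0))
    mib = 2 ** 20
    return {"usage_mb": usage / mib, "anon_mb": anon / mib, "cache_mb": cache / mib}


if __name__ == "__main__":
    result = read_cgroup_memory()
    if result is None:
        print("no cgroup memory files found (not Linux, or unexpected layout)")
    else:
        print(summarize(*result))
```

If `cache_mb` keeps climbing while `anon_mb` stays flat, the growth is more likely file cache counted against the container than memory held by the workers. If `anon_mb` itself grows, the processes are the more likely culprit.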