[RLlib] Ray Out Of Memory Error

Hi, when I run training with this command:

python3 dqn_train.py dqn_example/dqn_config.yaml --name dqn --overwrite

and using

Ray version: 1.1.0

Machine: Intel Core i7, GeForce RTX, 16 GB RAM

OS: Ubuntu 20.04

I get the memory error below. How can I fix it?

ray.memory_monitor.RayOutOfMemoryError: More than 95% of the memory on node yesmina-HP-Pavilion-Gaming-Laptop-16-a0xxx is used (14.78 / 15.41 GB). The top 10 memory consumers are:

76444 5.42GiB ray::CustomDQNTrainer.train()
76449 2.57GiB ray::RolloutWorker
76675 1.52GiB /home/yesmina/CARLA_0.9.11/CarlaUE4/Binaries/Linux/CarlaUE4-Linux-Shipping CarlaUE4 -OpenGL -windowe
76283 1.42GiB /home/yesmina/CARLA_0.9.11/CarlaUE4/Binaries/Linux/CarlaUE4-Linux-Shipping CarlaUE4 -OpenGL
78287 0.25GiB /opt/google/chrome/chrome --type=renderer --field-trial-handle=8221191988494462348,12935930956483335
43045 0.16GiB /opt/google/chrome/chrome --type=gpu-process --field-trial-handle=8221191988494462348,12935930956483
2285 0.16GiB /usr/share/teams/teams --type=renderer --autoplay-policy=no-user-gesture-required --disable-backgrou
76358 0.14GiB python3 dqn_train.py dqn_example/dqn_config.yaml --name dqn --overwrite
5905 0.13GiB /opt/google/chrome/chrome
29591 0.13GiB /opt/google/chrome/chrome --type=renderer --field-trial-handle=8221191988494462348,12935930956483335

Hey @Yasmina_Jaafra, thanks for posting this! Could you check the memory on your machine and the value of your DQN buffer_size config parameter (it’s 50000 by default)? Maybe you are just filling up the replay buffer with samples from the env and your machine cannot hold that many.
Also, what’s your observation space? If it’s large (e.g. image-based), each stored sample takes more memory, so the buffer will exhaust RAM much sooner.
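
To get a feel for the numbers, here is a rough back-of-the-envelope sketch in Python. It is not RLlib code and does not model how RLlib actually lays out its replay buffer. It assumes each stored transition holds two observations (current and next) and ignores actions, rewards, and Python overhead, so treat the result as a lower bound. The helper names and the 84x84x3 image shape are hypothetical, chosen only for illustration.

```python
import math

GIB = 2 ** 30


def replay_buffer_bytes(buffer_size, obs_shape, bytes_per_value, obs_per_transition=2):
    """Estimate the bytes needed to hold `buffer_size` transitions.

    Assumption: each transition stores `obs_per_transition` observations
    (current and next); everything else in the transition is ignored.
    """
    obs_bytes = math.prod(obs_shape) * bytes_per_value
    return buffer_size * obs_per_transition * obs_bytes


def max_buffer_size(budget_bytes, obs_shape, bytes_per_value, obs_per_transition=2):
    """Largest buffer_size whose estimated footprint fits within `budget_bytes`."""
    obs_bytes = math.prod(obs_shape) * bytes_per_value
    return budget_bytes // (obs_per_transition * obs_bytes)


if __name__ == "__main__":
    shape = (84, 84, 3)  # hypothetical image observation, not taken from this thread
    for label, width in (("uint8", 1), ("float32", 4)):
        size = replay_buffer_bytes(50000, shape, width)
        print(f"buffer_size=50000, {label}: {size / GIB:.2f} GiB")
    # Work backwards from a memory budget you are willing to give the buffer.
    print("fits in 4 GiB (float32):", max_buffer_size(4 * GIB, shape, 4))
```

Whether observations end up stored as 1-byte or 4-byte values depends on your preprocessing, so it is worth checking the dtype of what your env returns. Compare the estimate against the RAM left over after the CARLA processes and the rollout worker shown in your error message.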

Thank you for your assistance. I have 16 GB of RAM. The buffer_size parameter was set to 50000, and I am applying a CNN to the CARLA simulator. I reduced the buffer size and the problem was fixed. Is there an optimal value for this parameter?