Cutting of Pending Trials due to high CPU RAM utilization


I am running some experiments with Tune on 16 CPU cores and 1 GPU machine and configured --cpus-per-trial 5 and --gpus-per-trial 0.5. I am obliged to set half of the GPU available for each trial since I notice a severe CPU RAM overhead because of the ) call leaves many pending trials (17).
In the system monitor, I indeed see a rapid growth of memory and some processes that occupy RAM but are not actually being run since only two at the time run on GPU.

So basically, I can only run two trials in parallel since my machine will run out of CPU RAM rather than GPU RAM. I would like to decrease the number of pending trials to exploit the GPU at its full potential.

I checked the documentation and found that the environment variable TUNE_MAX_PENDING_TRIALS_PG could be changed to adapt the behavior.
So far, I haven’t succeeded, though, and I would greatly appreciate any help.

E.g., the naive modification at the top of the script, like in the following, doesn’t change anything:

os.environ["TUNE_MAX_PENDING_TRIALS_PG"] = "1"
    result =
        resources_per_trial={"cpu": args.cpus_per_trial, "gpu": args.gpus_per_trial},