[rllib] gpu sampling memory and performance issues

I am using rllib with a large Transformer model, which is why I want to use the GPU for sampling.
However, as far as I know, rllib only supports one sampling thread per GPU model copy. This causes GPU OOM errors when I want to use all of my 24 CPU cores: rllib then wants to create 24 GPU models, and each of them uses a few GB of memory.
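For reference, here is a minimal sketch of the kind of config I mean (standard trainer config keys; the env name and the exact values are only illustrative):

```python
import ray
from ray import tune

ray.init()

# Sketch only: with one rollout worker per CPU core and a GPU slice per
# worker, every worker builds its own copy of the model on the GPU.
config = {
    "env": "CartPole-v0",           # stand-in for my real environment
    "num_workers": 24,              # one rollout worker per CPU core
    "num_envs_per_worker": 1,
    "num_gpus": 1,                  # GPU for the learner/driver
    "num_gpus_per_worker": 1 / 24,  # each worker still loads a full GPU model,
                                    # which is where the few GB per worker go
}

tune.run("PPO", config=config, stop={"timesteps_total": 100000})
```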

Did I overlook an option in rllib, or is someone currently working on this problem?

I would imagine a setup where the num_cpus_per_worker option creates that many actors, each of which drives several environments, and all of these actors receive their actions from one central GPU model (see the sketch below).
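To make the idea concrete, here is a rough sketch of that pattern written with plain Ray actors rather than any existing rllib option (the GPUPolicy and RolloutWorker names and the dummy observations/actions are placeholders):

```python
import numpy as np
import ray

ray.init()  # assumes a machine with at least one GPU and 24 CPU cores

@ray.remote(num_gpus=1)
class GPUPolicy:
    """The single actor that holds the one GPU copy of the transformer."""
    def compute_actions(self, observations):
        # Placeholder for a real batched forward pass through the model.
        return [0 for _ in observations]

@ray.remote
class RolloutWorker:
    """CPU-only actor that drives several environments."""
    def __init__(self, policy, num_envs=4):
        self.policy = policy        # handle to the shared GPU actor
        self.num_envs = num_envs

    def sample(self, steps):
        count = 0
        for _ in range(steps):
            obs = [np.zeros(8) for _ in range(self.num_envs)]  # dummy observations
            # Every worker asks the same central GPU model for its actions.
            actions = ray.get(self.policy.compute_actions.remote(obs))
            count += len(actions)
        return count

policy = GPUPolicy.remote()
workers = [RolloutWorker.remote(policy, num_envs=4) for _ in range(24)]
print(ray.get([w.sample.remote(10) for w in workers]))
```

In practice the central actor would also need to batch requests from many workers to keep the GPU busy, but the memory picture is the point here: only one copy of the model lives on the GPU.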