Hey everyone,
I have a question/issue regarding PPO, more precisely about the multi-GPU optimizer it uses.
The problem I’m having is that, since I have rather large observations and a large model (graph convolutional NNs, where the graphs can differ in size, so I also have overhead from zero-padding), my GPU memory fills up quite rapidly because all of the data is pinned to GPU memory (as intended by the “multi GPU optimizer”). My question is: is there an easy way to change this behavior (i.e. stream the data from RAM to GPU memory), and if not, maybe there should be? As far as I can tell from the documentation, there is something like that for A2C:
microbatch_size – A2C supports microbatching, in which we accumulate gradients over batches of this size until the train batch size is reached. This allows training with batch sizes much larger than can fit in GPU memory. To enable, set this to a value less than the train batch size.
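For reference, here is a minimal sketch of how I understand that A2C option would be used (the train_batch_size and microbatch_size keys come from the quoted docs; the environment name and the concrete values are just placeholders):

```python
import ray
from ray import tune

ray.init()

# Sketch only: "train_batch_size" and "microbatch_size" are the A2C config
# keys described in the docs quote above; env and values are placeholders.
config = {
    "env": "CartPole-v0",
    "num_gpus": 1,
    "train_batch_size": 4000,
    # Accumulate gradients over micro-batches of 500 samples until the full
    # train batch of 4000 is reached, so only 500 samples need to sit on the
    # GPU at any one time.
    "microbatch_size": 500,
}

tune.run("A2C", config=config, stop={"training_iteration": 1})
```

Something equivalent to this for PPO’s multi-GPU optimizer is basically what I’m looking for.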