Memory issue debugging

Hi @Blubberblub,

My guess is that your environment is generating samples much faster than training is consuming them. That makes the sample queue in the policy server fill up, and the growing queue is what's eating your memory.

Try adding the following print and see whether the queue size keeps growing:

    # In the InputReader used by the policy server: log how many sample
    # batches are waiting before handing the next one to the trainer.
    @override(InputReader)
    def next(self):
        print(f"Size of samples queue is: {self.samples_queue.qsize()}")
        return self.samples_queue.get()
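
To make the failure mode concrete, here is a standalone sketch (plain Python, not RLlib code, and the rates are made up) of the same producer/consumer pattern: with an unbounded queue, a producer that outpaces the consumer makes `qsize()` climb steadily, and memory climbs with it.

    import queue
    import threading
    import time

    samples_queue = queue.Queue()  # unbounded, so the backlog can grow without limit

    def fast_env():
        # Producer: pretend a sample batch arrives every 10 ms.
        while True:
            samples_queue.put("sample_batch")
            time.sleep(0.01)

    def slow_trainer():
        # Consumer: pretend each training step takes 100 ms.
        while True:
            samples_queue.get()
            time.sleep(0.1)

    threading.Thread(target=fast_env, daemon=True).start()
    threading.Thread(target=slow_trainer, daemon=True).start()

    for _ in range(5):
        time.sleep(1)
        # The backlog grows by roughly 90 batches per second here.
        print(f"Size of samples queue is: {samples_queue.qsize()}")

In your case the producer is the external env hitting the policy server and the consumer is the training loop pulling from `next()`, but the arithmetic is the same.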

My env is not much faster than the training, but if I artificially slow training down by putting a breakpoint in the policy's training call and waiting 20 seconds, I see something like this:

    Size of samples queue is: 6
    Size of samples queue is: 5
    Size of samples queue is: 5
    Size of samples queue is: 5
    Size of samples queue is: 4
    Size of samples queue is: 3
    Size of samples queue is: 4
    Size of samples queue is: 3
    Size of samples queue is: 2
    Size of samples queue is: 1
    Size of samples queue is: 1
    Size of samples queue is: 0
    Size of samples queue is: 0
    Size of samples queue is: 0
    Size of samples queue is: 0
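
If the number does keep climbing, the longer-term fix is to add backpressure somewhere. As a rough sketch only, assuming the samples queue is a plain `queue.Queue` (I haven't checked the server internals), giving it a maxsize would make `put()` block the producing side once the queue is full, instead of letting memory grow:

    import queue

    # Hypothetical cap; tune it to your batch size and available RAM.
    MAX_PENDING_BATCHES = 16

    # Bounded queue: put() blocks once MAX_PENDING_BATCHES batches are waiting,
    # which throttles the env instead of letting the backlog grow unbounded.
    samples_queue = queue.Queue(maxsize=MAX_PENDING_BATCHES)

Whether blocking the env side is acceptable depends on your setup, so treat this as a direction to check rather than a drop-in change.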