@mannyv
I get the gist of what you are saying, and I understand the difference between the mean, min, and max.
So why can’t I plot the reward of every episode completed during training, if those rewards are already being collected and sorted per train_batch_size?