I looked into the RLlib library, but I had difficulty finding the file responsible for writing training results to TensorBoard. Also, where exactly are the min, max, and mean of the custom metrics I add via callbacks computed? Are they the min, max, and mean over the episodes, or per episode? Furthermore, how are the steps in TensorBoard computed? In other words, what does the x-axis of each graph in the TensorBoard console mean? Is it the batch-size timesteps, or is it the mean of the results of all episodes in the same timestep?
Tensorboard is being created in
@sven1977 can tell you more about min/max/mean.
Thank you for your answer!
For the min/max/mean, I mean the final results that are shown in the TensorBoard graphs. For example, for the custom metric I have named num_consolidated, I only added num_consolidated to my callback, not its min/max/mean, but in the TensorBoard results I see three values for it.
The same is true of RLlib's built-in TensorBoard stats, e.g. reward: a min/max/mean graph is shown for each stat. How are these min/max/mean values computed? Are they the min/max/mean over all episodes in a training batch? And where in the RLlib source code are they computed?
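To make the question concrete, here is a minimal sketch of one possible reading: each episode contributes a single value for the metric, and the three curves are aggregates across the episodes collected in one training iteration. This is only an illustration of that reading in plain Python, not RLlib's code. The helper name `summarize_custom_metric` and the sample values are made up for the example.

```python
from statistics import mean


def summarize_custom_metric(name, per_episode_values):
    """Hypothetical helper (not an RLlib function).

    Takes one metric value per episode and returns the three
    aggregated entries that would appear as separate curves.
    """
    return {
        f"{name}_mean": mean(per_episode_values),
        f"{name}_min": min(per_episode_values),
        f"{name}_max": max(per_episode_values),
    }


# Made-up num_consolidated values for three episodes in one iteration.
summary = summarize_custom_metric("num_consolidated", [2, 5, 5])
print(summary)
```

Under this reading, a single num_consolidated value logged per episode would show up as three series (mean, min, max). Whether RLlib aggregates this way is exactly what is being asked.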
My other question was: what does the x-axis in the TensorBoard graphs represent? Is it the number of steps in a batch, or the steps in a sample episode?
For example, in the graphs I sent you, what does step 16.8k on the x-axis mean in RL terminology?