How can I display the entropy of the policy network’s outputs,
relative to the maximum possible entropy, in TensorBoard?
CC @rliaw . Can you please help?
@sven1977 is the right person to answer this question!
Hey @Lauritowal, you can specify a custom callback that calculates this value and stores it so that it shows up in TB.
- Check out the example script rllib/examples/custom_metrics_and_callbacks.py
- In on_postprocess_trajectory, take a look at the postprocessed_batch arg. You should be able to see “ACTION_PROBS” and “ACTION_LOGP” as keys. You could use them to calculate the entropy.
- Store that entropy inside e.g.
- You should see these values now in TB.
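
For the entropy calculation itself, here is a rough sketch in plain NumPy, not RLlib code. It assumes a discrete action space and that you can get the full per-action probability vector for each step. The function names `normalized_entropy` and `sampled_entropy_estimate` are made up for illustration. If the batch only holds the log-prob of the action that was actually taken, the negative mean of those log-probs is a sample-based estimate of the entropy instead of the exact value.

```python
import numpy as np


def normalized_entropy(probs, eps=1e-12):
    """Entropy of each probability vector divided by the maximum entropy log(n).

    Assumes `probs` has shape (..., n_actions) and each row sums to 1
    (discrete action space). Returns values in [0, 1].
    """
    probs = np.asarray(probs, dtype=np.float64)
    n_actions = probs.shape[-1]
    if n_actions < 2:
        raise ValueError("need at least 2 actions for a non-zero maximum entropy")
    # Clip only inside the log so that zero-probability actions contribute 0.
    entropy = -np.sum(probs * np.log(np.clip(probs, eps, 1.0)), axis=-1)
    # The uniform distribution has the maximum entropy, log(n_actions).
    return entropy / np.log(n_actions)


def sampled_entropy_estimate(action_logp):
    """Monte Carlo entropy estimate from log-probs of the sampled actions.

    Uses H = -E[log p(a)], so this is noisy for small batches.
    """
    return float(-np.mean(np.asarray(action_logp, dtype=np.float64)))


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    peaked = [0.97, 0.01, 0.01, 0.01]
    print(normalized_entropy([uniform, peaked]))
```

A ratio close to 1 means the policy is still close to uniform, and a ratio close to 0 means it has become nearly deterministic. The per-step values could be averaged over the episode before being stored by the callback.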