Hi!
How can I display the entropy of the policy network’s outputs,
relative to the maximum possible entropy in Tensorboard?
@sven1977 is the right person to answer this question!
Hey @Lauritowal, you can specify a custom callback that calculates this value and stores it as a custom metric so that it shows up in TensorBoard.
Like so:
- Check out the example script rllib/examples/custom_metrics_and_callbacks.py
- Override on_postprocess_trajectory and take a look at the postprocessed_batch arg therein. You should be able to see “ACTION_PROBS” and “ACTION_LOGP” as keys. You could use them to calculate the entropy.
- Store that entropy inside e.g. episode.custom_metrics["policy_entropy"].
- You should see these values now in TB.
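For the "relative to the maximum possible entropy" part, here is a minimal sketch of the calculation itself, not a definitive implementation. It assumes a discrete action space with num_actions choices, and that the log-prob values in the batch are the log-probabilities of the actions that were actually sampled. Under those assumptions, the negative mean of the log-probs is a sample-based estimate of the policy entropy, and log(num_actions) is the maximum possible entropy. The helper name relative_entropy_estimate is hypothetical and not part of RLlib.

```python
import numpy as np


def relative_entropy_estimate(action_logp, num_actions):
    """Estimate policy entropy as a fraction of the maximum possible entropy.

    Hypothetical helper (not an RLlib API). Assumes a discrete action space
    and that `action_logp` holds log-probabilities of the sampled actions.
    """
    logp = np.asarray(action_logp, dtype=np.float64)
    if logp.size == 0:
        raise ValueError("action_logp must contain at least one value")
    if num_actions < 2:
        raise ValueError("num_actions must be at least 2")

    # E[-log pi(a|s)] over sampled actions is an estimate of the entropy.
    entropy_estimate = -logp.mean()

    # A uniform policy over num_actions actions has the largest entropy.
    max_entropy = np.log(num_actions)

    return float(entropy_estimate / max_entropy)
```

Inside on_postprocess_trajectory you could pass the log-prob column of postprocessed_batch to this helper and store the result in episode.custom_metrics["policy_entropy"]. Because the estimate is built from samples, it is noisy for short trajectories. The exact callback signature depends on your RLlib version, so it is left out here.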