Entropy of policy network's output

How can I display the entropy of the policy network’s outputs,
relative to the maximum possible entropy, in TensorBoard?

CC @rliaw . Can you please help?

@sven1977 is the right person to answer this question!


Hey @Lauritowal, you can specify a custom callback that calculates this value and stores it so that it shows up in TensorBoard (TB).

Like so:

  1. Check out the example script rllib/examples/custom_metrics_and_callbacks.py
  2. Override on_postprocess_trajectory and take a look at the postprocessed_batch arg therein. You should be able to see “ACTION_PROBS” and “ACTION_LOGP” as keys, which you can use to calculate the entropy.
  3. Store that entropy in e.g. episode.custom_metrics["policy_entropy"].
  4. These values should now show up in TensorBoard.
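
To make steps 2 and 3 a bit more concrete, here is a minimal sketch of the entropy math only, in plain Python with no RLlib imports. It assumes a discrete action space with n actions, so the maximum possible entropy is log(n), reached by a uniform policy. The helper names (entropy_from_probs, normalized_entropy, sampled_entropy_estimate) are made up for illustration and are not part of RLlib. Also note the assumption about the inputs: if the batch only holds the log-probability of the action that was actually taken, you can only form a sample-based estimate (minus the mean log-prob), whereas the exact entropy needs the probabilities of all actions at each step.

```python
import math
from typing import Sequence


def entropy_from_probs(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one discrete action distribution."""
    # Terms with p == 0 are skipped: p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def normalized_entropy(probs: Sequence[float]) -> float:
    """Entropy divided by log(n), the entropy of a uniform policy over n actions.

    The result lies in [0, 1]: 1 means uniform, 0 means deterministic.
    """
    n = len(probs)
    if n < 2:
        # With a single action the maximum entropy is 0, so the ratio is undefined;
        # returning 0.0 here is an arbitrary choice made for this sketch.
        return 0.0
    return entropy_from_probs(probs) / math.log(n)


def sampled_entropy_estimate(action_logp: Sequence[float]) -> float:
    """Monte Carlo entropy estimate: minus the mean log-prob of the sampled actions."""
    return -sum(action_logp) / len(action_logp)
```

Inside on_postprocess_trajectory you would compute one of these from the values in postprocessed_batch (averaged over the timesteps in the batch) and write the result to episode.custom_metrics["policy_entropy"], as in step 3. Dividing the sampled estimate by log(n) gives the same kind of relative value, but it is noisy and can fall outside [0, 1] on short trajectories.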