Fetch action probability distribution from trained policy

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

How can I get the action probability distribution from a trained policy for a particular state using Ray/RLlib 2.x?

I tried using policy.compute_single_action(state, full_fetch=True), but that only fetches additional information for the single action that was selected.

Thanks,
Stefan

Hi @steff,

Others may have a better way, but the best approach I know of is to use the action_dist_inputs key in the extra-fetches dictionary to construct your own action distribution and compute the probabilities from that.
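Here is a minimal sketch of that idea for a discrete action space. It assumes a trained torch-framework Algorithm; algo and obs are placeholders for your own trained algorithm and state, not names from your script:

```python
import torch

# Placeholders: `algo` is your trained RLlib Algorithm (torch framework),
# `obs` is a single observation/state.
policy = algo.get_policy()

# Policy.compute_single_action() returns the chosen action, the RNN
# state-outs, and an extra-fetches dict.
action, state_out, extra = policy.compute_single_action(obs)

# Raw inputs to the action distribution (logits, for a discrete space).
dist_inputs = extra["action_dist_inputs"]

# Rebuild the same distribution class the policy samples from.
action_dist = policy.dist_class(torch.from_numpy(dist_inputs), policy.model)

# Log-probability of the action that was actually selected:
logp = action_dist.logp(torch.tensor(action))

# For a categorical (discrete) distribution, softmaxing the logits gives
# the probability of every action in this state.
probs = torch.softmax(torch.from_numpy(dist_inputs), dim=-1)
print(probs)
```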

@arturn This is the third request for this that I have seen in the forums this month. Perhaps it makes sense to store all the action probabilities in addition to the selected actions.

Thanks! I’ll open a feature request!

@mannyv And yes, that’s how we do it ourselves: fetch a fresh action distribution and feed it the provided inputs!

@steff In the meantime, please have a look at our Policy classes! For example, in the SAC policies we turn the action distribution inputs into distributions that you can sample from.
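For continuous actions (e.g. SAC’s squashed Gaussian) there is no finite table of probabilities, but the same pattern lets you query log-likelihoods or draw fresh samples. Again just a hedged sketch, reusing the placeholder policy and obs from the snippet above:

```python
import torch

# Continuous case (e.g. SAC): dist_class is a squashed Gaussian, so we
# query log-likelihoods instead of per-action probabilities.
action, _, extra = policy.compute_single_action(obs)
action_dist = policy.dist_class(
    torch.from_numpy(extra["action_dist_inputs"]), policy.model
)

# Log-likelihood of the chosen action (or any other action tensor) ...
logp_chosen = action_dist.logp(torch.from_numpy(action))

# ... and fresh samples from the same distribution.
new_sample = action_dist.sample()
```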