How to store agent's `infos` in the output?

Hi folks,

I am working on a bare-metal policy that works fine now. However, I need some clarity regarding the agent’s infos from compute_actions():

I store agent infos in this dictionary. Is the infos dictionary indeed intended to carry this information?

What do I want? I want to carry agent infos from one timestep to the next, to store it in the SampleBatch that gets written out to output in the Trainer configuration.

What have I tried? Simply outputting in compute_actions() the infos I want to store does not work (I see only the environment infos). Adding a ViewRequirement for 'infos' does not work either (I then do not get even the infos from the environment).

Any help is appreciated to bring clarity to my question.



Hey @Lars_Simon_Zehnder, if you access "infos" within your compute_actions method, it should show up there from then on. E.g., in your custom compute_actions_from_input_dict (or action_sampler_fn or action_distribution_fn):

def compute_actions_from_input_dict(self, input_dict, explore=None, timestep=None, **kwargs):
    # Access the `infos` key here so that it always shows up during
    # action sampling. After an initial test call, RLlib will always
    # include "infos" in the input dict from here on.
    infos = input_dict.get("infos")
    assert infos is not None
    # ... compute and return the actions as usual.
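
The mechanism described here can be pictured with a small toy model. This is not RLlib code: `TrackingDict`, `probe_required_keys` and `my_policy` are hypothetical names, and the sketch only assumes that the framework runs one test call on a dummy input and remembers which keys the policy read.

```python
class TrackingDict(dict):
    """Dict that remembers which keys were read (hypothetical helper)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed_keys = set()

    def __getitem__(self, key):
        self.accessed_keys.add(key)
        return super().__getitem__(key)

    def get(self, key, default=None):
        # dict.get does not go through __getitem__, so track it separately.
        self.accessed_keys.add(key)
        return super().get(key, default)


def probe_required_keys(policy_fn, dummy_input):
    """Run one test call and return the keys the policy actually read."""
    tracked = TrackingDict(dummy_input)
    policy_fn(tracked)
    return tracked.accessed_keys


def my_policy(input_dict):
    obs = input_dict["obs"]
    infos = input_dict.get("infos")  # reading the key marks it as required
    return [0 for _ in obs]


dummy = {"obs": [0.0, 0.0], "infos": [{}, {}], "rewards": [0.0, 0.0]}
required = probe_required_keys(my_policy, dummy)
```

In this toy version, `required` contains "obs" and "infos" but not "rewards", because the policy never touched that key during the test call.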

So, I did some research, and it turns out that it needs some specific settings in the Policy to make RLlib write out the infos from the agent to the output:

  1. It needs a ViewRequirement with the `space` parameter provided (e.g. a Box space, like the default space in ViewRequirement), and
  2. `used_for_training=True`; otherwise the variable does not get collected for the `output`.
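
To make the two conditions concrete, here is a toy sketch in plain Python. It does not import RLlib: `ViewReq` and `columns_for_output` are hypothetical stand-ins that only mirror the two points above, and the real collection logic in RLlib may differ in detail.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ViewReq:
    """Hypothetical stand-in for a view requirement, not RLlib's class."""
    space: Optional[Any] = None
    used_for_training: bool = True


def columns_for_output(view_reqs: Dict[str, ViewReq]) -> List[str]:
    """Keep only columns that have a space and are flagged for training."""
    return sorted(
        name for name, req in view_reqs.items()
        if req.space is not None and req.used_for_training
    )


view_reqs = {
    "obs": ViewReq(space="Box(4,)"),
    "infos": ViewReq(space="Box()", used_for_training=False),
}
before = columns_for_output(view_reqs)

# Flip the flag, as point 2 in the list above suggests.
view_reqs["infos"].used_for_training = True
after = columns_for_output(view_reqs)
```

Here `before` lists only "obs", while `after` also includes "infos" once the flag is set.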

See PR #18111 for an example.

I hope this spares some time for others :slight_smile: