I am working on a bare-metal policy that works fine now. However, I need some clarity regarding the agent’s
I store agent infos in this dictionary. Is the
infos dictionary indeed intended to carry this information?
What do I want? I want to carry agent infos from one timestep to the next and store them in the
SampleBatch that gets written out to
output in the
What have I tried? Simply returning the infos I want to store from
compute_actions() does not work (I see only the environment
infos). Adding a
'infos' does not work either (I then do not get even the
infos from the environment).
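A minimal sketch of the first attempt described above, assuming the policy follows the RLlib-style contract in which compute_actions() returns a tuple of (actions, state_outs, extra outputs). The class name SketchPolicy, the constant action, and the "step_note" contents are hypothetical stand-ins for illustration only; the sketch does not import or subclass anything from the real library.

```python
from typing import Any, Dict, List, Optional, Tuple


class SketchPolicy:
    """Hypothetical stand-in for a hand-written policy (not the real base class)."""

    def compute_actions(
        self,
        obs_batch: List[Any],
        state_batches: Optional[List[Any]] = None,
        **kwargs: Any,
    ) -> Tuple[List[int], List[Any], Dict[str, List[Any]]]:
        # Always pick action 0; a real policy would compute this from obs_batch.
        actions = [0 for _ in obs_batch]
        # One agent-side info dict per observation (hypothetical contents).
        agent_infos = [{"step_note": i} for i, _ in enumerate(obs_batch)]
        # Third element: extra per-step outputs, keyed here by "infos".
        return actions, [], {"infos": agent_infos}


actions, state_outs, extra = SketchPolicy().compute_actions(["obs_a", "obs_b"])
```

This only shows the shape of what is being returned; whether the sampler keeps, overwrites, or drops an "infos" key supplied this way is exactly the open question.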
Any help clarifying this would be appreciated.