So, I have a rollout function that returns the following structure:
Rollout_results(info=infos, states=states, values=values, actions=actions, rewards=rewards, win=win, logps=logps, entropies=entropies, dones=dones, net_info=network_infos)
Most of this info is useful in downstream calculations (e.g., computing GAE).
However, to avoid duplicating work that RLlib already does, I want to switch to using 'trainer.evaluate()' instead, since it gracefully handles both the single-agent and multi-agent cases under the hood.
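For context, this is roughly how those fields feed into GAE. It is only a minimal sketch, assuming `rewards`, `values` and `dones` are flat per-step sequences from a single rollout; the names `compute_gae`, `last_value`, `gamma` and `lam` are illustrative and not part of RLlib or of my actual code.

```python
import numpy as np


def compute_gae(rewards, values, dones, last_value=0.0, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout (illustrative helper).

    rewards, values, dones: per-step sequences of equal length.
    last_value: value estimate for the state after the final step
                (assumed 0.0 if the rollout ended in a terminal state).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)

    advantages = np.zeros_like(rewards)
    next_value = last_value
    next_adv = 0.0

    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]  # stop bootstrapping across episode ends
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        next_adv = delta + gamma * lam * not_done * next_adv
        advantages[t] = next_adv
        next_value = values[t]

    returns = advantages + values  # regression targets for the value function
    return advantages, returns
```

This is why I need at least the per-step rewards, values and dones back from whatever replaces my rollout function.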
Is there a way to get all this info out of the