Observation and info out of sync

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Problem
I want to use the info dictionary in a custom policy to help choose an action. However, I noticed that info and obs are out of sync.
I subclassed ray.rllib.examples.policy.random_policy.RandomPolicy and overrode the method compute_actions(self, obs_batch, *args, info_batch, **kwargs). There I noticed that the infos in info_batch were one step ahead of the observations in obs_batch.
To check this, I also added a callback on_postprocess_trajectory(self, *, worker, episode, agent_id, policy_id, policies, postprocessed_batch, original_batches, **kwargs) in order to have a look at postprocessed_batch. In there, I noticed that infos were in sync with new_obs but one ahead of obs (in accordance with the previous paragraph).
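The same offset can be reproduced without RLlib. Below is a minimal sketch, assuming a Gym-style step() that returns the info dict together with the next observation. ToyEnv and collect_rollout are hypothetical names made up for illustration, and whether RLlib assembles its batches in exactly this way is an assumption, not something I have confirmed.

```python
# Toy illustration only: ToyEnv and collect_rollout are hypothetical names,
# not part of RLlib or Gym.


class ToyEnv:
    """Counter environment following the Gym-style reset/step convention."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # initial observation; no info dict exists for it yet

    def step(self, action):
        self.t += 1
        obs = self.t
        reward = 0.0
        done = self.t >= self.horizon
        info = {"state_at": self.t}  # describes the state *after* the step
        return obs, reward, done, info


def collect_rollout(env):
    """Record one row per transition: obs, new_obs and the info from step()."""
    rows = []
    obs = env.reset()
    done = False
    while not done:
        action = 0  # placeholder action, irrelevant for this toy
        new_obs, reward, done, info = env.step(action)
        rows.append({"obs": obs, "new_obs": new_obs, "info": info})
        obs = new_obs
    return rows


rows = collect_rollout(ToyEnv())
for row in rows:
    print(row)
```

In every row of this sketch the info matches new_obs and is one step ahead of obs, which is the same pattern I see in postprocessed_batch.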
So my question is: is this a bug or a feature? If it's not a bug, is it possible to use the info object during training?
Thanks in advance for your help.