Episode.last_info_for() always returns empty dictionary in custom callback

Hi! I’m trying to compute custom metrics based on values in the info dictionary returned on every step of my custom MultiAgentEnv. The info dictionary that I’m returning looks something like this:

info_dict = {
    "0": {"reward": 500},
    "1": {"reward": 400},
    ...
}

Within the dictionary, “0”, “1”, etc. are my agent IDs.

In my on_episode_step callback, I’m trying to retrieve the info dictionary as follows:

for i in range(4):
    agent_id = str(i)
    info_dict = episode.last_info_for(agent_id)
    if info_dict:
        rewards_dict[agent_id] = rewards_dict.get(agent_id, 0) + info_dict["reward"]

However, info_dict always ends up being empty, and I’m not sure how to go about debugging this. Any help would be greatly appreciated!

Hi @jgonik,

Which version of Ray are you using?

There is a recent GitHub issue similar to this:

I’m also using the nightly version, so that GitHub issue addresses the problem. Thanks!
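For anyone hitting the same symptom: the `if info_dict:` check in the original loop silently skips empty results, so nothing shows whether the infos are missing for every agent or only some. Below is a minimal sketch of one way to make that visible by counting empty lookups per agent. `FakeEpisode`, `accumulate_rewards`, and `missing_counts` are hypothetical names made up for this example. The stub only mimics the `last_info_for(agent_id)` call from the question and is not RLlib's actual class, so the sketch runs without Ray installed.

```python
from collections import defaultdict


class FakeEpisode:
    """Hypothetical stand-in for an episode object; NOT the RLlib class."""

    def __init__(self, infos):
        self._infos = infos  # maps agent ID -> most recent info dict

    def last_info_for(self, agent_id):
        # Assumption: an agent with no stored info yields an empty dict,
        # which is the symptom described in this thread.
        return self._infos.get(agent_id, {})


def accumulate_rewards(episode, agent_ids, rewards_dict, missing_counts):
    """Sum info["reward"] per agent and count how often the info was empty."""
    for agent_id in agent_ids:
        info = episode.last_info_for(agent_id)
        if not info:
            # Record the miss instead of skipping it silently.
            missing_counts[agent_id] += 1
            continue
        rewards_dict[agent_id] = rewards_dict.get(agent_id, 0) + info["reward"]


agent_ids = [str(i) for i in range(4)]
rewards_dict = {}
missing_counts = defaultdict(int)

# Two simulated steps in which only agents "0" and "1" report an info dict.
episode = FakeEpisode({"0": {"reward": 500}, "1": {"reward": 400}})
for _ in range(2):
    accumulate_rewards(episode, agent_ids, rewards_dict, missing_counts)

print(rewards_dict)          # per-agent reward totals
print(dict(missing_counts))  # per-agent count of empty info lookups
```

In a real callback, the same function body could be called from on_episode_step with the actual episode object. If every agent shows up in `missing_counts` on every step, the problem is probably upstream of the callback (as with the issue mentioned above) rather than in the metric code itself.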