jgonik
June 29, 2021, 11:53pm
1
Hi! I’m trying to compute custom metrics based on values in the info dictionary returned on every step of my custom MultiAgentEnv. The info dictionary that I’m returning looks something like this:
```python
info_dict = {
    "0": {"reward": 500},
    "1": {"reward": 400},
    ...
}
```
Within the dictionary, “0”, “1”, etc. are my agent IDs.
In my on_episode_step callback, I’m trying to retrieve the info dictionary as follows:
```python
for i in range(4):
    agent_id = str(i)
    info_dict = episode.last_info_for(agent_id)
    if info_dict:
        rewards_dict[agent_id] = rewards_dict.get(agent_id, 0) + info_dict["reward"]
```
However, info_dict always ends up being empty, and I’m not sure how to go about debugging this. Any help would be greatly appreciated!
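For anyone following along, here is a minimal, self-contained mock of the loop above (no Ray required). `FakeEpisode` is a hypothetical stand-in for RLlib’s episode object, used only to show why the `if info_dict:` guard makes the failure silent: an empty dict is falsy, so nothing is ever added and no error is raised.

```python
class FakeEpisode:
    """Stand-in for RLlib's episode object (illustrative only)."""

    def __init__(self, infos):
        self._infos = infos  # maps agent ID -> last info dict

    def last_info_for(self, agent_id):
        # RLlib returns an empty dict when no info was recorded for the agent.
        return self._infos.get(agent_id, {})


def accumulate_rewards(episode, rewards_dict):
    # Same accumulation logic as in the callback snippet above.
    for i in range(4):
        agent_id = str(i)
        info_dict = episode.last_info_for(agent_id)
        if info_dict:  # empty dict is falsy -> silently skipped
            rewards_dict[agent_id] = rewards_dict.get(agent_id, 0) + info_dict["reward"]
    return rewards_dict


# With populated infos, the metric accumulates as intended:
ok = accumulate_rewards(FakeEpisode({"0": {"reward": 500}, "1": {"reward": 400}}), {})
print(ok)  # {'0': 500, '1': 400}

# With empty infos (the observed behavior), nothing is added and no error appears:
broken = accumulate_rewards(FakeEpisode({}), {})
print(broken)  # {}
```

This is why the bug is hard to spot: the callback runs cleanly either way, and the only symptom is a metrics dict that never fills in.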
mannyv
June 30, 2021, 3:24am
2
Hi @jgonik ,
Which version of ray are you using?
There is a recent github issue similar to this:
(opened 08:11PM, 25 Jun 21 UTC; labels: P2, bug)
### What is the problem?
With a MultiAgentEnv and an on_episode_end custom callback (at least, I have not tested other callbacks or Envs), I am expecting to see the last info dictionary for each agent, but I am seeing only empty dictionaries for each agent.
I think I have tracked down the issue to [this line](https://github.com/ray-project/ray/blob/9b17c35bee6a28b4b09ede36fdb9fdced3929f55/rllib/env/base_env.py#L473) which I think should be:
`infos = self.last_infos`
but I am not 100% sure of that.
*Ray version and other system information (Python version, TensorFlow version, OS):*
Python 3.8.3, Windows, TF 2.5, Numpy 1.19.5
### Reproduction (REQUIRED)
A simple reproduction script: I edited the BasicMultiAgent example to add content to the info dictionaries, and added a print statement showing what we expect to see in the on_episode_end callback for validation.
```python
import gym
from ray.rllib.env.multi_agent_env import MultiAgentEnv
from ray.rllib.examples.env.mock_env import MockEnv
import ray
from ray import tune
from ray.tune.registry import register_env
from typing import Dict
from ray.rllib.env import BaseEnv
from ray.rllib.policy import Policy
from ray.rllib.evaluation import MultiAgentEpisode, RolloutWorker
from ray.rllib.agents.callbacks import DefaultCallbacks


class MyCallback(DefaultCallbacks):
    def on_episode_end(self, worker: RolloutWorker, base_env: BaseEnv,
                       policies: Dict[str, Policy], episode: MultiAgentEpisode,
                       **kwargs):
        print("INSIDE CALLBACK: ", episode._agent_to_last_info)


class MyBasic(MultiAgentEnv):
    def __init__(self, num):
        self.agents = [MockEnv(25) for _ in range(num)]
        self.dones = set()
        self.observation_space = gym.spaces.Discrete(2)
        self.action_space = gym.spaces.Discrete(2)
        self.resetted = False

    def reset(self):
        self.resetted = True
        self.dones = set()
        return {i: a.reset() for i, a in enumerate(self.agents)}

    def step(self, action_dict):
        obs, rew, done, info = {}, {}, {}, {}
        for i, action in action_dict.items():
            obs[i], rew[i], done[i], info[i] = self.agents[i].step(action)
            info[i] = {"MY TESTER": 1.0}
            if done[i]:
                self.dones.add(i)
        done["__all__"] = len(self.dones) == len(self.agents)
        if done["__all__"]:
            print("INSIDE ENV: ", info)
        return obs, rew, done, info


register_env('basic', lambda env_config: MyBasic(4))
config = {
    "env": 'basic',
    'callbacks': MyCallback
}
ray.init(address='auto', _redis_password='5241590000000000')
results = tune.run("PPO", config=config, stop={"training_iteration": 2})
```
- [x] I have verified my script runs in a clean environment and reproduces the issue.
- [x] I have verified the issue also occurs with the [latest wheels](https://docs.ray.io/en/master/installation.html).
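To make the suspected bug concrete, here is a toy sketch (the class and attribute names besides `last_infos` are illustrative, not Ray’s real internals): if the polling code returns a per-step buffer that has already been cleared instead of the cached last-seen infos, every callback sees empty dicts.

```python
class BuggyPoller:
    """Toy model of an env poller caching per-agent infos (illustrative only)."""

    def __init__(self):
        self.last_infos = {}  # most recent infos, kept until the next step
        self.new_infos = {}   # per-poll buffer, cleared after each poll

    def record_step(self, infos):
        self.last_infos = infos
        self.new_infos = {}  # already drained by the time callbacks run

    def poll(self):
        return self.new_infos  # bug: returns the cleared buffer

    def poll_fixed(self):
        return self.last_infos  # the one-line fix proposed in the issue


poller = BuggyPoller()
poller.record_step({"0": {"reward": 500}})
print(poller.poll())        # {} -> callbacks see empty info dicts
print(poller.poll_fixed())  # {'0': {'reward': 500}}
```

Under this reading, `infos = self.last_infos` on the linked line would restore the expected behavior, which matches the symptom in both your post and the issue.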
jgonik
June 30, 2021, 5:19pm
3
I’m also using the nightly version, so that GitHub issue addresses my problem. Thanks!