Post process trajectory with full episode

Coac · March 22, 2022, 3:09pm

How severely does this issue affect your experience of using Ray?

Medium: It makes completing my task significantly more difficult, but I can work around it.

Hello,

We would like to post-process trajectories in a multi-agent setting so that rewards can be shared at the end of each episode.

We first looked into the on_episode_end callback, but it seems to provide only episode: MultiAgentEpisode and not the sample batch. Is there a way to get the sample batch there, so that we are free to modify the reward at any step t?

There is also the on_postprocess_trajectory callback, but the sample batch it receives does not contain the full episode. To get the full episode there, we would be forced to use the complete_episodes batch mode, which we don't want to do.
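To make the goal concrete, here is a minimal sketch of the kind of reward sharing we have in mind. It is plain Python and deliberately does not touch RLlib: the function name share_final_reward, the dict-of-lists layout, and the "add the team's mean return to each agent's last step" rule are all hypothetical and only illustrate the idea. It only makes sense if every agent's full episode is available, which is exactly what we are missing in the callbacks.

```python
from typing import Dict, List


def share_final_reward(
    rewards_by_agent: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Hypothetical sharing rule (an assumption, not an RLlib API).

    Each agent's last step additionally receives the mean episode
    return over all agents. The input is left unmodified.
    """
    if not rewards_by_agent:
        return {}

    # Episode return of each agent: the sum over its own steps.
    returns = {aid: sum(r) for aid, r in rewards_by_agent.items()}
    team_bonus = sum(returns.values()) / len(returns)

    shared: Dict[str, List[float]] = {}
    for aid, rewards in rewards_by_agent.items():
        new_rewards = list(rewards)  # copy, so the caller's data is untouched
        if new_rewards:
            new_rewards[-1] += team_bonus
        shared[aid] = new_rewards
    return shared


# Toy episode: only agent_0 ever receives a reward from the environment.
episode = {"agent_0": [0.0, 0.0, 1.0], "agent_1": [0.0, 0.0, 0.0]}
shared = share_final_reward(episode)
print(shared)
```

With a truncated fragment instead of the full episode, the per-agent returns above would be partial sums, so the bonus would be wrong. That is why we need the whole episode at the point where the rewards are rewritten.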

Thanks

Elena · October 17, 2023, 1:27pm

Hi,

Did you manage to solve the problem? I am trying to use the on_postprocess_trajectory callback to modify both rewards and observations for some agents, but it seems that this callback is called only after the rewards and observations have already been used for learning.

Thanks.
