Question about internal states to the environment

mannyv · October 4, 2021, 1:02am

I am not sure I followed the part about the gradient flowing through the internal states. There are no gradients during the sampling phase of the execution plan, only during the learning phase.

I do not know of a nice, clean way to do this in RLlib. Maybe someone else will have a better approach.

I would probably try to implement this using a custom callback. The on_episode_step method has access to the worker and the environments. The worker has access to the policies and they in turn have the model.

What I would do is create a custom model that stores whatever values I wanted to communicate as a member variable on the model. Then, in the callback, I would read that info from the model and set it on the environment.

Keep in mind that on_episode_step is called after an action is taken, so the value the callback writes at step t is only visible to the environment at step t+1.
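
To make the data flow concrete, here is a rough, library-free sketch of the idea. Nothing in it imports RLlib: FakePolicy, FakeWorker and FakeBaseEnv are hypothetical stand-ins for the objects the callback receives, and shared_value and model_info are attribute names I made up. In a real setup the callback would subclass RLlib's DefaultCallbacks, and I am assuming worker.get_policy() and base_env.get_unwrapped() are available in your version (later releases renamed the latter), so treat those calls as assumptions to check against your install.

```python
# Library-free sketch. The Fake* classes are hypothetical stand-ins, not RLlib classes.

class ModelWithSharedState:
    """Stand-in for a custom model that remembers a value from its last forward pass."""

    def __init__(self):
        self.shared_value = None  # hypothetical attribute name

    def forward(self, obs):
        # A real model would derive this from its network; here it is a toy value.
        self.shared_value = 2 * obs
        return obs  # pretend this is the action


class EnvWithInbox:
    """Stand-in for an environment that can receive a value from the model."""

    def __init__(self):
        self.model_info = None  # hypothetical attribute name

    def step(self, action):
        # Whatever is in model_info was written by the callback after the previous step.
        return self.model_info


class FakePolicy:
    def __init__(self, model):
        self.model = model


class FakeWorker:
    def __init__(self, policy):
        self._policy = policy

    def get_policy(self):
        return self._policy


class FakeBaseEnv:
    def __init__(self, envs):
        self._envs = envs

    def get_unwrapped(self):
        return self._envs


class ShareStateCallback:
    """In RLlib this would subclass DefaultCallbacks (assumption, not shown here)."""

    def on_episode_step(self, *, worker, base_env, env_index=0, **kwargs):
        model = worker.get_policy().model
        env = base_env.get_unwrapped()[env_index]
        # Copy the stored value from the model onto the environment.
        env.model_info = model.shared_value


def run_rollout(num_steps=3):
    """Toy sampling loop: act, step the env, then fire the callback."""
    model = ModelWithSharedState()
    env = EnvWithInbox()
    worker = FakeWorker(FakePolicy(model))
    base_env = FakeBaseEnv([env])
    callback = ShareStateCallback()

    seen_by_env = []
    for obs in range(1, num_steps + 1):
        action = model.forward(obs)
        seen_by_env.append(env.step(action))
        callback.on_episode_step(worker=worker, base_env=base_env, env_index=0)
    return seen_by_env


print(run_rollout(3))  # [None, 2, 4]
```

The printed list shows the one-step delay: on the first step the environment sees nothing, and on each later step it sees the value the model stored on the previous step.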
