Question about internal states to the environment

Aceticia · October 4, 2021, 12:15am

I’ve been working on a project that requires passing the internal state of some agents to the environment so that it can compute rewards. I believe I can’t simply send it as part of the action, because that would let the gradient flow through the internal states. Is there a way to pass states to the environment? Or is there a way to disable the gradient for one part of the action? What would be an alternative? Thank you.

mannyv · October 4, 2021, 1:02am

@Aceticia,

I am not sure I followed the part about the gradient flowing through the internal states. There are no gradients during the sampling phase of the execution plan, only during the learning phase.

I do not know a nice, clean way to do this in RLlib. Maybe someone else will have a better approach.

I would probably try to implement this using a custom callback. The on_episode_step method has access to the worker and the environments. The worker has access to the policies and they in turn have the model.

What I would do is create a custom model that stores whatever values I want to communicate as a member variable on the model. Then, in the callback, I would take that info from the model and put it on the environment.

Keep in mind that on_episode_step is called after an action is taken, so you would be using the callback to add info to the environment for step t+1.
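A rough sketch of that pattern, written with plain stand-in objects instead of real RLlib classes so that it runs on its own. Everything named here is hypothetical (ToyModel, ToyEnv, last_internal_state, agent_internal_state, InternalStateCallbacks). In an actual setup the callback class would subclass RLlib's DefaultCallbacks and the model would be a registered custom model. The accessors worker.get_policy() and base_env.get_unwrapped() are what I recall from RLlib releases of this period and may be named differently in other versions, so treat them as assumptions to check.

```python
from types import SimpleNamespace


class ToyModel:
    """Stand-in for a custom model that remembers a value from its last forward pass."""

    def __init__(self):
        self.last_internal_state = None

    def forward(self, obs):
        # Pretend the internal state is some function of the observation.
        self.last_internal_state = [2.0 * x for x in obs]
        return sum(obs)  # dummy "action"


class ToyEnv:
    """Stand-in environment with a hypothetical attribute the callback fills in."""

    def __init__(self):
        self.agent_internal_state = None


class InternalStateCallbacks:
    """Assumption: in RLlib this would subclass DefaultCallbacks."""

    def on_episode_step(self, *, worker, base_env, episode=None, env_index=None, **kwargs):
        # worker -> policy -> model, as described above.
        model = worker.get_policy().model
        # Pick the sub-environment this step belongs to.
        env = base_env.get_unwrapped()[env_index or 0]
        # Runs after the action at step t, so the env sees this value at step t+1.
        env.agent_internal_state = model.last_internal_state


# Wire the stand-ins together the way the sampling loop would.
model = ToyModel()
env = ToyEnv()
worker = SimpleNamespace(get_policy=lambda: SimpleNamespace(model=model))
base_env = SimpleNamespace(get_unwrapped=lambda: [env])

model.forward([1.0, 2.0])
InternalStateCallbacks().on_episode_step(worker=worker, base_env=base_env, env_index=0)
print(env.agent_internal_state)  # [2.0, 4.0]
```

Because the value is copied outside of any loss computation, nothing here takes part in the gradient. The environment just reads env.agent_internal_state inside its next step() call when it computes the reward.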

Aceticia · October 4, 2021, 5:43am

Got it. This seems complicated, though. Just curious: is there any chance that RLlib will support some kind of fixed-size extra info passed from the agents to the env? It would just need to be handled the same way actions are handled, but without the learning part.
