How next_state_in works

TianrenWang · June 9, 2025, 11:59am

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.

2. Environment:

Ray version: 2.43.0
Python version: 3.11.7
OS: macOS 15.2

3. What happened vs. what you expected:

Expected: The next_state_in in the batch passed to forward should match the next_state_in that the previous forward call returned in its output.
Actual: There doesn’t seem to be any correlation between the two next_state_in values.

Can someone confirm how next_state_in works? I couldn’t find any documentation for it. I am currently using it in my code as shown below. After I set Columns.NEXT_STATE_IN in the output of the forward function, does that value become available under Columns.NEXT_STATE_IN in the batch passed to forward at the next time step?

def _forward_intermediate(self, batch):
    initialHidden = None
    if "next_state_in" in batch:
        initialHidden = batch["next_state_in"].unsqueeze(0)
    .
    .
    .
@override(TorchRLModule)
def _forward(self, batch, **kwargs):
    currentStateFeatures, initialStateFeatures = self._forward_intermediate(batch)
    policy = self.policy_branch(currentStateFeatures)
    return {
        Columns.ACTION_DIST_INPUTS: policy,
        Columns.NEXT_STATE_IN: initialStateFeatures,
    }

I tried debugging by printing the NEXT_STATE_IN returned by one forward call and the NEXT_STATE_IN received by the next one, but the values are never the same.
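
Instead of comparing printed values by eye, the check could be automated. The following is only a minimal sketch under several assumptions: `StateHandoffChecker` and `toy_forward` are hypothetical names and not part of RLlib, states are flattened to plain lists of floats (for a torch tensor, something like `tensor.flatten().tolist()`), and a single environment is stepped one forward call at a time. With batched or shuffled data, rows would first have to be matched per episode and time step, which this sketch does not attempt.

```python
import math


class StateHandoffChecker:
    """Hypothetical debugging helper (not part of RLlib).

    Remembers the state a forward call returned and compares it with the
    state that the following forward call receives.
    """

    def __init__(self, tol=1e-6):
        self.tol = tol
        self.last_out = None  # state returned by the previous forward call
        self.mismatches = []  # (step, max_abs_diff) for each failed comparison
        self.step = 0

    def check_in(self, state_in):
        """Call at the top of forward with the incoming state (or None)."""
        if self.last_out is not None and state_in is not None:
            if len(state_in) != len(self.last_out):
                # Different sizes cannot be the same state at all.
                self.mismatches.append((self.step, math.inf))
            else:
                diff = max(
                    (abs(a - b) for a, b in zip(state_in, self.last_out)),
                    default=0.0,
                )
                if diff > self.tol:
                    self.mismatches.append((self.step, diff))
        self.step += 1

    def record_out(self, state_out):
        """Call just before forward returns, with the outgoing state."""
        self.last_out = list(state_out)


def toy_forward(state_in, checker):
    """Stand-in for a recurrent forward pass; the update rule is arbitrary."""
    checker.check_in(state_in)
    prev = state_in if state_in is not None else [0.0, 0.0]
    state_out = [0.5 * h + 1.0 for h in prev]
    checker.record_out(state_out)
    return state_out


checker = StateHandoffChecker()
state = None  # no state before the first step
for _ in range(5):
    # The behaviour I expected: each output state is fed back unchanged.
    state = toy_forward(state, checker)

print(checker.mismatches)  # stays empty when the state is handed over unchanged
```

If the same two calls (`check_in` at the top of forward, `record_out` before returning) report a mismatch on every step in the real module, the state arriving in the batch is not the one returned by the previous call, at least under the single-environment assumption above.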
