Information about episode reset

In the TF-Agents library, each time step carries a StepType that
indicates whether the current observation is the first, a middle, or the last observation of an episode.

I think this would be a helpful signal when implementing a custom model.
Could you please consider adding this feature? Or am I missing a feature that is already implemented in Ray?


If you mean to use this during training, then I think you could implement it yourself as follows. I believe the sample batch has a key called “t” that indicates the current step in the episode, so a step is first if t == 0. The sample batch also has a key called “done”, which you could use to detect last with done == 1.
Finally, you could combine those: mid = not first and not last.
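
As a rough sketch of the idea above (not an official RLlib API), the per-step labels could be derived like this. It assumes you have already pulled the “t” and “done” columns out of the sample batch as plain sequences; the StepType enum and the step_types helper are hypothetical names, with the integer values chosen to match TF-Agents (FIRST=0, MID=1, LAST=2).

```python
from enum import IntEnum
from typing import List, Sequence


class StepType(IntEnum):
    """Hypothetical enum; values mirror TF-Agents' StepType."""
    FIRST = 0
    MID = 1
    LAST = 2


def step_types(t: Sequence[int], done: Sequence[bool]) -> List[StepType]:
    """Label each step as FIRST, MID, or LAST.

    `t` is assumed to hold the step index within the episode and `done`
    the episode-end flag, both taken from the sample batch.
    """
    if len(t) != len(done):
        raise ValueError("t and done must have the same length")

    labels = []
    for t_i, done_i in zip(t, done):
        if done_i:
            # Checked first, so a one-step episode is labelled LAST.
            labels.append(StepType.LAST)
        elif t_i == 0:
            labels.append(StepType.FIRST)
        else:
            labels.append(StepType.MID)
    return labels


# Two episodes back to back: lengths 3 and 2.
labels = step_types([0, 1, 2, 0, 1], [False, False, True, False, True])
print([label.name for label in labels])
```

Note that a step with t == 0 and done == 1 is both first and last; this sketch labels it LAST, but you may prefer a different convention for your model.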

If you want to use this during sample collection, then I guess the best way might be to add it as part of the observation.
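
A minimal sketch of that second option, assuming an environment with the classic reset() / step() interface where step() returns (obs, reward, done, info). The StepTypeObservation wrapper and the CountdownEnv toy environment are hypothetical and do not depend on any library; in a real setup you would presumably also need to extend the environment's observation space to match the new dict observation.

```python
class StepTypeObservation:
    """Hypothetical wrapper that adds a step-type flag to each observation."""

    FIRST, MID, LAST = 0, 1, 2

    def __init__(self, env):
        self.env = env

    def reset(self):
        obs = self.env.reset()
        # The observation returned by reset() is always the first of an episode.
        return {"obs": obs, "step_type": self.FIRST}

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        step_type = self.LAST if done else self.MID
        return {"obs": obs, "step_type": step_type}, reward, done, info


class CountdownEnv:
    """Hypothetical toy environment: the episode ends after `length` steps."""

    def __init__(self, length=3):
        self.length = length
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0

    def step(self, action):
        self.steps += 1
        done = self.steps >= self.length
        return float(self.steps), 1.0, done, {}


env = StepTypeObservation(CountdownEnv(length=2))
print(env.reset()["step_type"])     # first observation
obs, reward, done, info = env.step(0)
print(obs["step_type"], done)       # middle of the episode
obs, reward, done, info = env.step(0)
print(obs["step_type"], done)       # last observation
```

With this approach the custom model can read the flag directly from its input, at the cost of changing the observation format the policy sees.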

Thank you for your suggestion!

I managed to find a workaround, but
I think this would be a great help if it were offered as an official feature.
I will file a feature request for this.