Predictions outside the action space during inference


I’m running PPO with a custom env. Training runs fine, and I checked that the predicted actions stay within the action space.

I save a checkpoint every iteration. But when I load one using policy.from_checkpoint, the actions predicted by the loaded policy range from -1 to 1, whereas the action space should be 0 to 30.

Is there any postprocessing I’m missing?
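My guess (not confirmed) is that the loaded policy returns normalized actions in [-1, 1] and that something during training rescales them to the env's bounds. If that is the case, the missing step would be a linear rescale like the sketch below. `rescale_action` is a hypothetical helper I wrote for illustration, not a library function, and the bounds 0 and 30 are taken from my action space.

```python
from typing import List, Sequence


def rescale_action(action: Sequence[float], low: float, high: float) -> List[float]:
    """Map each component from the assumed range [-1, 1] linearly onto [low, high].

    Hypothetical helper for illustration only; it assumes the policy output
    is normalized to [-1, 1], which is a guess based on the values observed.
    """
    rescaled = []
    for a in action:
        # Clamp first so values slightly outside [-1, 1] stay within bounds.
        a = max(-1.0, min(1.0, a))
        # -1 maps to low, +1 maps to high, 0 maps to the midpoint.
        rescaled.append(low + (a + 1.0) * (high - low) / 2.0)
    return rescaled


# Example raw outputs in the range I am seeing from the loaded policy.
raw_action = [-1.0, 0.0, 0.5, 1.0]
env_action = rescale_action(raw_action, low=0.0, high=30.0)
print(env_action)
```

If this is the right idea, I would apply it to every action from the loaded policy before passing it to env.step.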

Any help would be appreciated.