The problem is exactly as the title says! I’m using a custom RLlib multi-agent environment with PPO, and I have defined my action space as:
def get_action_space(self, agent):
    """Returns the action space."""
    return gym.spaces.Box(low=-7.5, high=2.9, shape=(1,), dtype=np.float32)
For example, the action space is supposed to be between -7.5 and +2.9, but a generated action may have a value of 50 or even -1594974.6. I’m still not sure why this is happening, or whether these two issues are even related. Can anyone give me an idea of what could be wrong?
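To make the symptom concrete, here is a minimal sketch of how I could check and clip the raw actions myself before using them in the environment. It assumes the out-of-range values come from the policy sampling from an unbounded distribution, which I have not confirmed. The names `clip_action`, `is_in_bounds`, and the agent IDs are hypothetical helpers for illustration, not part of RLlib.

```python
import numpy as np

# Bounds copied from the Box definition in get_action_space().
ACTION_LOW = np.float32(-7.5)
ACTION_HIGH = np.float32(2.9)


def is_in_bounds(action):
    """Hypothetical check: True if every component lies in [low, high]."""
    action = np.asarray(action, dtype=np.float32)
    return bool(np.all((action >= ACTION_LOW) & (action <= ACTION_HIGH)))


def clip_action(action):
    """Hypothetical helper: force a raw action into [low, high]."""
    action = np.asarray(action, dtype=np.float32)
    return np.clip(action, ACTION_LOW, ACTION_HIGH)


# Example action dict shaped like what a multi-agent step() receives.
# The first two values are the out-of-range ones I observed.
raw_actions = {
    "agent_0": np.array([50.0]),
    "agent_1": np.array([-1594974.6]),
    "agent_2": np.array([1.25]),
}

# Clip every agent's action before it is used by the environment.
clipped = {agent: clip_action(a) for agent, a in raw_actions.items()}
```

Clipping like this only hides the symptom inside the environment; it does not explain why the values are so large. RLlib also has a `clip_actions` config setting, and whether it is enabled by default seems to depend on the version, so that may be worth checking too.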