Continuous actions go beyond the defined action_space and then produce NaNs with multi-agent PPO

The problem is exactly as the title says! I'm using a custom RLlib multi-agent environment with PPO, and I have defined my action space as:

def get_action_space(self, agent):
        """Returns the action space for the given agent."""
        return gym.spaces.Box(
            low=-7.5, high=2.9, shape=(1,), dtype=np.float32
        )
For example, the action space is supposed to be bounded between -7.5 and +2.9, yet a generated action may have a value of 50 or even -1594974.6, after which the training losses turn into NaN. I'm still not sure why this is happening, or whether the out-of-range actions and the NaNs are even related. Can anyone give me an idea of what could be wrong?
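To rule out the space definition itself, I ran a quick sanity check (a minimal sketch, assuming the same `gym` and `numpy` imports my environment uses): sampling from the Box always stays within the bounds, and `contains()` rejects values like 50. So the out-of-range values seem to come from the policy, not from the space.

```python
import gym
import numpy as np

space = gym.spaces.Box(low=-7.5, high=2.9, shape=(1,), dtype=np.float32)

# Sampling directly from the space never leaves [-7.5, 2.9].
for _ in range(1000):
    assert space.contains(space.sample())

# An out-of-range value such as 50 fails the membership check.
print(space.contains(np.array([50.0], dtype=np.float32)))  # False
```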