PPO Policy not respecting action-space bounds

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I have defined an environment that has the following action space:

self.action_space = Box(low=np.array([0.01, 0.01, 0, 0.01, 0.01, 1]), high=np.array([25, 5, 3, 1, 1, 15]))
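
For context, a stripped-down version of the environment looks like this (MyEnv is a stand-in for my real class, the observation space shown is just a placeholder, and I'm assuming the gymnasium flavor of the API here):

import numpy as np
import gymnasium as gym
from gymnasium.spaces import Box

class MyEnv(gym.Env):
    def __init__(self, config=None):
        # Six continuous action dimensions, each with its own lower/upper bound
        self.action_space = Box(
            low=np.array([0.01, 0.01, 0, 0.01, 0.01, 1], dtype=np.float32),
            high=np.array([25, 5, 3, 1, 1, 15], dtype=np.float32),
        )
        # Placeholder observation space; the real one doesn't matter for this issue
        self.observation_space = Box(low=-np.inf, high=np.inf, shape=(4,), dtype=np.float32)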

Then, I used PPO to train an agent on this environment. During training, the bounds are respected and no problems arise.

However, I now want to use the trained policy for inference. To do so, I use the following code:

from ray.rllib.policy.policy import Policy

agent = Policy.from_checkpoint(path_to_checkpoint)['default_policy']
action = agent.compute_single_action(state)[0]
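
To make the problem concrete, I can check the returned actions against the space like this (env here is an instance of my environment):

# contains() returns False whenever the action violates the Box bounds
print(env.action_space.contains(action))  # frequently False for me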

However, this constantly produces actions that fall outside the predefined bounds. How can I fix this? I tried setting

agent.config["clip_actions"] = True

and also passing the flag directly:

action = agent.compute_single_action(state, clip_actions=True)[0]

but neither solved the problem.
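
For now I can clamp the actions myself as a stopgap, but I'd rather have the policy respect the bounds natively:

import numpy as np

# Manually clip the raw action into the declared Box bounds
action = np.clip(action, env.action_space.low, env.action_space.high)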

Any help is appreciated.
Thanks!