How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I have defined an environment that has the following action space:
self.action_space = Box(low=np.array([0.01, 0.01, 0, 0.01, 0.01, 1]), high=np.array([25, 5, 3, 1, 1, 15]))
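For context, here is a minimal sketch of the environment definition (the class name MyEnv is a placeholder, and I assume gymnasium, which recent Ray versions use; reset/step are omitted):

import numpy as np
import gymnasium as gym
from gymnasium.spaces import Box

class MyEnv(gym.Env):  # placeholder name
    def __init__(self, config=None):
        # Continuous action space with per-dimension bounds
        self.action_space = Box(
            low=np.array([0.01, 0.01, 0, 0.01, 0.01, 1]),
            high=np.array([25, 5, 3, 1, 1, 15]),
        )
        # observation_space, reset(), and step() omitted here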
Then, I used PPO to train an agent on this environment. During training, the bounds are respected and no problems arise.
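Roughly, the training looks like this (a sketch only; my actual config has more settings, and the number of iterations here is arbitrary):

from ray.rllib.algorithms.ppo import PPOConfig

config = PPOConfig().environment(MyEnv)
algo = config.build()
for _ in range(100):
    algo.train()
checkpoint = algo.save()  # checkpoint used below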
However, I now want to use the policy that resulted from training. To do so, I use the following code:
from ray.rllib.policy.policy import Policy

agent = Policy.from_checkpoint(path_to_checkpoint)["default_policy"]
action = agent.compute_single_action(state)[0]  # returns (action, state_outs, info); take the action
However, this consistently produces actions that fall outside the predefined bounds. How can I correct this? I tried setting
agent.config["clip_actions"] = True
or passing the flag directly:
action = agent.compute_single_action(state,clip_actions=True)[0]
but neither solved the problem.
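As a stopgap, I can clip the actions manually against the space bounds (a sketch; this assumes the restored policy exposes the original action space, and it masks the symptom rather than fixing whatever is wrong):

import numpy as np

# Assumption: agent.action_space carries the env's original Box bounds
low, high = agent.action_space.low, agent.action_space.high
action = np.clip(agent.compute_single_action(state)[0], low, high)

But I would prefer to understand why the restored policy ignores the bounds in the first place.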
Any help is appreciated.
Thanks!