Training Action Masked PPO - ValueError: all input arrays must have the same shape

Migrating from the old API stack, I made changes to action_masking_rlm.py to get the example working (action_masking_rl_module.py). However, in my own environment setup I am getting the error from the title: ValueError: all input arrays must have the same shape.

I am not sure what is causing it. I added shape checking to the action-masking RL module to enforce the shapes of the observations and action masks, but it never raises an error.
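For reference, a minimal sketch of the kind of check I added (the "action_mask" and "observations" keys reflect my own Dict observation layout, not necessarily yours):

import numpy as np

def check_batch_shapes(batch_obs, obs_space, action_space):
    """Validate one batch of Dict observations before the forward pass."""
    for obs in batch_obs:
        mask = np.asarray(obs["action_mask"])
        inner = np.asarray(obs["observations"])
        # Every mask must have exactly one entry per discrete action.
        assert mask.shape == (action_space.n,), \
            f"mask shape {mask.shape} != ({action_space.n},)"
        # Every inner observation must match the declared space.
        assert inner.shape == obs_space["observations"].shape, \
            f"obs shape {inner.shape} != {obs_space['observations'].shape}"

Every assertion passes, which makes me suspect the mismatch only appears later, when the sampled episodes are stacked into a batch.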

Some insight on where to look or how this process is meant to work would be much appreciated.

Similar issue here: at runtime it objects to the environment having a discrete action space. This is in a PPO optimization with action masking.
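As I understand it, the masking setup expects a contract along these lines: a Dict observation wrapping the real observation plus a mask over a Discrete action space. A toy sketch (class and key names are my own, not necessarily what the shipped example uses):

import gymnasium as gym
import numpy as np

class MaskedEnv(gym.Env):
    """Toy env showing the observation/action contract for masking."""

    def __init__(self):
        self.action_space = gym.spaces.Discrete(4)
        self.observation_space = gym.spaces.Dict({
            # One mask entry per discrete action, on every step.
            "action_mask": gym.spaces.Box(0.0, 1.0, (4,), np.float32),
            "observations": gym.spaces.Box(-1.0, 1.0, (6,), np.float32),
        })

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self._obs(), {}

    def step(self, action):
        return self._obs(), 0.0, False, False, {}

    def _obs(self):
        return {
            "action_mask": np.ones(4, dtype=np.float32),
            "observations": np.zeros(6, dtype=np.float32),
        }

So a Discrete action space should be supported in itself; the mask length and the space just have to agree.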

Hello, please try to reproduce this with the latest version of the action-masking example for the new API stack. If it is still not working, open an issue on GitHub, including your config and module files.
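For reference, the wiring on the new API stack should look roughly like this (a sketch; the import paths and flag names are my best reading of recent Ray and may differ slightly across versions):

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.core.rl_module.rl_module import RLModuleSpec
# Path below is an assumption based on the example layout in recent Ray:
from ray.rllib.examples.rl_modules.classes.action_masking_rlm import (
    ActionMaskingTorchRLModule,
)

config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    # Your registered env with the Dict observation space described above.
    .environment("my_masked_env")
    .rl_module(
        rl_module_spec=RLModuleSpec(module_class=ActionMaskingTorchRLModule),
    )
)

The key point is that on the new stack the module class travels through an RLModuleSpec, not through the old custom_model key in the model config.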

Well, on a fresh install of ray[default], yielding Ray 2.40.0, try running the following example:

cd ~/anaconda3/lib/python3.12/site-packages/ray/rllib/examples/rl_module
python ./action_masking_rl_module.py --enable-new-api-stack --num-env-runners 2

The error is:
TypeError: ActionMaskingRLModule.__init__() got an unexpected keyword argument 'observation_space'
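Poking at the traceback, this looks like a constructor signature mismatch: the new API stack instantiates modules with keyword arguments (observation_space, action_space, and so on), while the legacy signature took a single config object. A minimal sketch of a constructor that accepts the new-style kwargs (my reading of the 2.40 source, so treat the details as an assumption):

from ray.rllib.core.rl_module.torch.torch_rl_module import TorchRLModule

class MyActionMaskingModule(TorchRLModule):
    def __init__(self, **kwargs):
        # New stack: accept observation_space, action_space, model_config,
        # etc. as keyword arguments and forward them, instead of the
        # legacy single RLModuleConfig positional argument.
        super().__init__(**kwargs)

    def setup(self):
        # Build the networks here; self.observation_space and
        # self.action_space are populated by the base class.
        pass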

I have yet to find an example from a fresh install of Ray 2.40.0 where action masking actually works, which is pretty much a showstopper for me.

For me, this has been a common complaint; I go back to Ray 0.8-something. What I routinely find is that the examples just don't work, because the code base has outpaced them. I can understand that in the daily releases of 3.0.0.dev0, but not in a tagged release like 2.40.0.