How severely does this issue affect your experience of using Ray?
High: It blocks me from completing my task.
I recently updated Ray to 2.38.0 to migrate towards the new API stack. I have been using the action masking example to train a PPO agent in a custom environment. While migrating I ran into errors and found that the action masking example itself does not run.
To reproduce the error:
From ray/rllib/examples/rl_modules, run:
python action_masking_rl_module.py --enable-new-api-stack --num-env-runners 2
The error I am getting is:
(SingleAgentEnvRunner pid=84919) module = self.module_class( [repeated 2x across cluster]
(SingleAgentEnvRunner pid=84919) TypeError: ActionMaskingRLModule.__init__() got an unexpected keyword argument 'observation_space' [repeated 2x across cluster]
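To illustrate what I think is happening, outside of RLlib entirely: the example's module class still defines the old config-only constructor, so the new keyword-argument call site trips over it (plain-Python sketch; `OldStyleModule` is made up for illustration):

```python
# A stand-in for a module that still uses the old config-based constructor.
class OldStyleModule:
    def __init__(self, config):
        self.config = config

# The new API stack instead instantiates modules with keyword arguments,
# which the old signature cannot accept.
try:
    OldStyleModule(observation_space=None, action_space=None, model_config={})
except TypeError as e:
    print(e)  # -> __init__() got an unexpected keyword argument 'observation_space'
```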
It seems that in rl_module.py the TypeError is not handled as part of the model_config deprecation. However, even after handling the TypeError locally, the example still errors:
File "<...>/ray/rllib/examples/rl_modules/classes/action_masking_rlm.py", line 131, in _preprocess_batch
action_mask = batch[Columns.OBS].pop("action_mask")
AttributeError: 'Tensor' object has no attribute 'pop'
I checked the return types: the environment correctly returns OrderedDicts as its observations, but somewhere along the pipeline they are turned into a single tensor, which is what causes the crash.
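For debugging I replaced the failing pop with a guarded helper along these lines (a sketch; `split_action_mask` is my own name, mirroring what `_preprocess_batch` does in action_masking_rlm.py):

```python
from ray.rllib.core.columns import Columns


def split_action_mask(batch):
    """Pop the action mask off a dict observation, failing loudly otherwise."""
    obs = batch[Columns.OBS]
    if not isinstance(obs, dict):
        # By the time the batch reaches the module, a connector has apparently
        # flattened the dict observation into a single tensor, so the mask
        # can no longer be split off.
        raise TypeError(
            f"Expected a dict obs with an 'action_mask' key, got {type(obs)}"
        )
    action_mask = obs.pop("action_mask")
    return action_mask, batch
```

With this in place the failure at least reports the tensor/dict mismatch explicitly instead of dying on the AttributeError.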
Some notable warnings that I saw:
From Gymnasium:
<...>/gymnasium/envs/registration.py:693: UserWarning: WARN: Overriding environment rllib-single-agent-env-v0 already in registry.
<...>/gymnasium/utils/passive_env_checker.py:275: UserWarning: WARN: The reward returned by `step()` must be a float, int, np.integer or np.floating, actual type: <class 'numpy.ndarray'>
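(The second warning suggests my environment's step() hands back the reward as a numpy array rather than a scalar; coercing it silences the checker, though it looks unrelated to the crash. Sketch:)

```python
import numpy as np

raw_reward = np.asarray(1.0)  # a 0-d array, like the env apparently returns
reward = float(raw_reward)    # a plain float satisfies Gymnasium's checker
```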
And of course the ones from RLlib:
WARNING deprecation.py:50 -- DeprecationWarning: `RLModule(config=[RLModuleConfig object])` has been deprecated. Use `RLModule(observation_space=.., action_space=.., inference_only=.., model_config=.., catalog_class=..)` instead. This will raise an error in the future!
WARNING deprecation.py:50 -- DeprecationWarning: `RLModule(config=[RLModuleConfig])` has been deprecated. Use `RLModule(observation_space=.., action_space=.., inference_only=.., learner_only=.., model_config=..)` instead. This will raise an error in the future!
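Judging purely from the warning text, a constructor matching the new stack would either accept those keyword arguments explicitly or simply pass everything through (a sketch, not verified against the 2.38 sources; the class name is hypothetical):

```python
from ray.rllib.core.rl_module.torch.torch_rl_module import TorchRLModule


class PassthroughActionMaskingRLModule(TorchRLModule):
    # Forward whatever the stack provides -- the deprecated `config` object
    # on the old path, or the keyword arguments named in the warnings
    # (observation_space, action_space, inference_only, model_config, ...)
    # on the new one -- and let the base class handle the deprecation shim.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
```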