**1. Severity of the issue:**
High: Completely blocks me.
**3. What happened:**
In the Action Mask example, the environment's observation space must be a `Dict` containing both `observations` and `action_mask`, but when I add an LSTM, the model insists on a flattened observation vector. How can I integrate a simple LSTM layer into the Action Mask example without breaking its required `Dict` structure?
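For context, a minimal sketch of the kind of `Dict` observation space the action-masking setup expects (key names follow the RLlib action-masking example; the action count and observation shape are placeholders):

```python
import gymnasium as gym
import numpy as np

# Sketch of the Dict observation space used for action masking:
# "action_mask" flags which of the discrete actions are currently valid,
# "observations" holds the actual environment observation.
N_ACTIONS = 5  # placeholder action count

observation_space = gym.spaces.Dict({
    "action_mask": gym.spaces.Box(0.0, 1.0, shape=(N_ACTIONS,), dtype=np.float32),
    "observations": gym.spaces.Box(-1.0, 1.0, shape=(10,), dtype=np.float32),
})
```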
I use this connector to flatten the observation space. Without it, I can't use the simple LSTM that can be enabled in the model config:
```python
from ray.rllib.connectors.env_to_module import (
    FlattenObservations,
    PrevActionsPrevRewards,
)


def _env_to_module(env, spaces=None, device=None):
    return [
        # Inject the previous timesteps' actions and rewards into the input.
        PrevActionsPrevRewards(
            multi_agent=False,
            n_prev_rewards=4,
            n_prev_actions=4,
        ),
        # Flatten the Dict observation space; the LSTM requires a flat vector input.
        FlattenObservations(multi_agent=False),
    ]
```
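For completeness, this is roughly how the connector pipeline gets wired into the config (a sketch assuming the new API stack; `"my_action_mask_env"` stands in for my registered environment):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Sketch: pass the connector factory to the EnvRunners, which call it
# to build the env-to-module pipeline. The env name is a placeholder.
config = (
    PPOConfig()
    .environment("my_action_mask_env")
    .env_runners(env_to_module_connector=_env_to_module)
)
```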
And I set my `config.rl_module` like this:
```python
.rl_module(
    rl_module_spec=RLModuleSpec(
        module_class=DefaultPPOTorchRLModule,
        model_config={
            "fcnet_hiddens": TRAIN_CONFIG["fcnet_hiddens"],
            "fcnet_activation": TRAIN_CONFIG["fcnet_activation"],
            # Simple LSTM setting (one layer, no attention).
            "use_lstm": True,
            "lstm_cell_size": 256,
            "max_seq_len": 100,
            "lstm_use_prev_action": True,
            "lstm_use_prev_reward": False,
            "vf_share_layers": True,
        },
    ),
)
```
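For reference, the same model settings can also be expressed with the `DefaultModelConfig` dataclass instead of a raw dict (a sketch assuming the new API stack; `TRAIN_CONFIG` is my own dict from above):

```python
from ray.rllib.core.rl_module.default_model_config import DefaultModelConfig

# Sketch: same LSTM settings via the DefaultModelConfig dataclass
# (field names mirror the model_config dict keys above).
model_config = DefaultModelConfig(
    fcnet_hiddens=TRAIN_CONFIG["fcnet_hiddens"],
    fcnet_activation=TRAIN_CONFIG["fcnet_activation"],
    use_lstm=True,
    lstm_cell_size=256,
    max_seq_len=100,
    lstm_use_prev_action=True,
    lstm_use_prev_reward=False,
    vf_share_layers=True,
)
```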