How to add an LSTM layer to the action-masking example with the new API stack

**1. Severity of the issue:**

High: Completely blocks me.

**3. What happened**
In the action-masking example, the environment’s observation space must be a Dict containing both `observations` and `action_mask`, but when I enable the built-in LSTM, the default model insists on a flat observation vector. How can I integrate a simple LSTM layer into the action-masking example without breaking its required Dict structure?

I currently use this env-to-module connector to flatten the observation space. Without it, I can’t use the simple LSTM that can be enabled via the model config.

```python
def _env_to_module(env, spaces=None, device=None):
    return [
        # Inject the previous timesteps' actions and rewards into the sequence.
        PrevActionsPrevRewards(
            multi_agent=False,
            n_prev_rewards=4,
            n_prev_actions=4,
        ),
        # Flatten the Dict observation space; the LSTM requires a flat vector input.
        FlattenObservations(multi_agent=False),
    ]
```
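To see why flattening is the wrong fix here: `FlattenObservations` concatenates all leaves of the Dict into one vector, so the action mask ends up inside the features the LSTM consumes and is no longer separately available for masking the logits. A minimal illustration (plain NumPy, not RLlib code):

```python
import numpy as np

# A Dict-style observation: a 3-action mask plus a 4-dim observation.
obs = {
    "action_mask": np.array([1.0, 0.0, 1.0], dtype=np.float32),
    "observations": np.array([0.2, -0.5, 0.7, 0.1], dtype=np.float32),
}

# Flattening concatenates the leaves (sorted keys mimic how Dict
# spaces are flattened deterministically): the mask values become
# ordinary "features" in a single 7-dim vector.
flat = np.concatenate([obs[k] for k in sorted(obs)])
```

After this, the LSTM treats the mask entries as just three more input features, and nothing downstream can use them to zero out disallowed actions.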

And I set `config.rl_module` like this:
```python
.rl_module(
    rl_module_spec=RLModuleSpec(
        module_class=DefaultPPOTorchRLModule,
        model_config={
            "fcnet_hiddens": TRAIN_CONFIG["fcnet_hiddens"],
            "fcnet_activation": TRAIN_CONFIG["fcnet_activation"],
            # Simple LSTM setting (1 layer, no attention).
            "use_lstm": True,
            "lstm_cell_size": 256,
            "max_seq_len": 100,
            "lstm_use_prev_action": True,
            "lstm_use_prev_reward": False,
            "vf_share_layers": True,
        },
    ),
)
```

I think it should suffice to strip the dictionary/mask out of the batch before feeding it into the LSTM: pass each of the `super()` forward methods a version of the batch in which `OBS` is just the plain observation tensor, with no Dict wrapper and no mask. `_preprocess_batch` seems to already do that for you, so you’d just want to apply it in `_forward_train` as well. You don’t want to flatten your observation, because then you’re feeding the mask directly into the LSTM as if it were a feature, and you lose the ability to mask the action logits.
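To make the last step concrete, here is a framework-free sketch (plain NumPy; the function name is illustrative) of the usual way the extracted mask is re-applied to the action logits after the LSTM head has produced them:

```python
import numpy as np

def apply_action_mask(logits, action_mask):
    """Add log(mask) to the logits: allowed actions (mask == 1) stay
    unchanged, disallowed actions (mask == 0) get a huge negative
    logit and thus ~zero probability after softmax."""
    with np.errstate(divide="ignore"):  # log(0) -> -inf is intentional
        inf_mask = np.maximum(np.log(action_mask), -1e10)
    return logits + inf_mask

logits = np.array([2.0, 1.0, 0.5])
mask = np.array([1.0, 0.0, 1.0])  # action 1 is currently disallowed
masked = apply_action_mask(logits, mask)
```

This is why the mask has to stay out of the LSTM’s input and be carried alongside the batch: it is consumed here, after the forward pass, not inside it.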