Hi all, I’m trying to add an action mask to an LSTM/attention model that works with the use_lstm or use_attention parameters. My basic setup is a wrapper model that contains a FullyConnectedNetwork; the contained network is built with use_attention=True, while the wrapper itself is not.
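For context, the wrapper is plugged in as a custom model, with the flags coming in through custom_model_config. A simplified sketch of that part of my config (the registration name and key values here are placeholders rather than my exact setup), where GenericModel is the class shown below:

from ray.rllib.models import ModelCatalog

ModelCatalog.register_custom_model("generic_model", GenericModel)

config = {
    "model": {
        "custom_model": "generic_model",
        # These end up as the extra keyword args in GenericModel.__init__.
        "custom_model_config": {
            "use_attention": True,
            "agent_type": "generic",
        },
    },
}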
import torch.nn as nn
from gym.spaces import flatten_space

from ray.rllib.models.torch.fcnet import FullyConnectedNetwork
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.policy.view_requirement import ViewRequirement


class GenericModel(TorchModelV2, nn.Module):
    def __init__(self, obs_space, action_space, num_outputs, model_config, name,
                 use_attention=False, use_lstm=False, agent_type="generic", **kwargs):
        nn.Module.__init__(self)
        super(GenericModel, self).__init__(
            obs_space, action_space, num_outputs, model_config, name
        )
        # Only the "real_obs" part of the Dict obs space feeds the embedding net.
        self.real_obs_space = flatten_space(obs_space["real_obs"])
        # Forward the wrapper's flags into the contained model's config.
        if use_attention:
            model_config["use_attention"] = True
        if use_lstm:
            model_config["use_lstm"] = True
        self.agent_type = agent_type
        # Contained network that should (in theory) get the attention wrapper.
        self.fc_embed = FullyConnectedNetwork(
            self.real_obs_space, action_space, model_config["fcnet_hiddens"][-1],
            model_config, name, **kwargs
        )
        # Expose the contained model's view requirements, but keep the full
        # Dict obs space for SampleBatch.OBS.
        self.view_requirements = self.fc_embed.view_requirements
        self.view_requirements[SampleBatch.OBS] = ViewRequirement(shift=0, space=obs_space)
From what I understand, this should wrap the smaller contained model in an attention wrapper. However, it’s completely ignoring the attention-wrapping aspect and just using a plain FullyConnectedNetwork, and I’m struggling to figure out why. Does Ray only apply attention wrappers to the top-level model that’s passed into the trainer?
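My current guess is that the wrapping only happens when a model is built through ModelCatalog.get_model_v2, so constructing FullyConnectedNetwork directly bypasses it entirely. If that’s right, something like the sketch below is what I’d need inside __init__ instead; this is untested and the argument choices are my own assumptions:

from ray.rllib.models import ModelCatalog

# Inside GenericModel.__init__, let the catalog build (and, if the flags are
# set, attention/LSTM-wrap) the contained model instead of instantiating
# FullyConnectedNetwork by hand. Untested sketch.
self.fc_embed = ModelCatalog.get_model_v2(
    obs_space=self.real_obs_space,
    action_space=action_space,
    num_outputs=model_config["fcnet_hiddens"][-1],
    model_config=model_config,
    framework="torch",
    name=name + "_embed",
)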
I know I’m taking an approach not recommended by the docs (Models, Preprocessors, and Action Distributions — Ray v2.0.0.dev0), but I only need to add an action mask to an otherwise barebones model, so I figured that wrapping an attention model would make the most sense.
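For completeness, the masking itself is just the standard logit-masking trick from the parametric-actions example. Roughly what I have in mind for the forward pass (untested sketch, assuming the obs Dict has "real_obs" and "action_mask" keys as above):

import torch

# Methods on GenericModel (sketch only):
def forward(self, input_dict, state, seq_lens):
    # Run the contained (ideally attention-wrapped) model on the real obs only.
    logits, state = self.fc_embed(
        {"obs": input_dict["obs"]["real_obs"]}, state, seq_lens
    )
    # Push the logits of invalid actions toward -inf so they are never sampled.
    action_mask = input_dict["obs"]["action_mask"]
    inf_mask = torch.clamp(torch.log(action_mask), min=torch.finfo(logits.dtype).min)
    return logits + inf_mask, state

def value_function(self):
    return self.fc_embed.value_function()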
Apologies for reposting this thread from the Ray slack, I figured the official forums would be a better location to discuss this.