**How severe does this issue affect your experience of using Ray?**

- High: It blocks me from completing my task.

I am trying to train a multi-agent model with action masking based on this example. However, in that example `num_outputs` is the same size as the action space, which is not the case in my model, so I am unsure how to proceed.

Some things I have attempted:

- Ignoring the `num_outputs` parameter when building my neural network and instead giving the final layer the same number of neurons as the action space. This produced the following error:
`RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x19 and 256x256)`

- Emulating the implementation shown here. I was unsure about `avail_actions` and how to generate them, and I received the same `RuntimeError` as in the previous attempt.
- Changing `num_outputs` to be the same size as the action space. I was unable to determine where `num_outputs` is being set, and I didn't see any option in the configs that appeared to refer to it.
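
To make the first failure concrete, here is a minimal standalone reproduction of that shape error. The layer sizes (256, batch 32, 19 actions) are taken from the error message; the idea that some downstream RLlib layer expects `num_outputs`-sized input is my assumption:

```python
import torch
import torch.nn as nn

# Hypothetical downstream layer that expects 256 features, i.e. what a
# num_outputs == 256 model would feed it (assumption from the error message).
head = nn.Linear(256, 256)

# My final layer instead produced action-space-sized output: (batch=32, 19).
model_out = torch.randn(32, 19)

try:
    head(model_out)
except RuntimeError as e:
    # Same class of error as in my run:
    # mat1 and mat2 shapes cannot be multiplied (32x19 and 256x256)
    print(type(e).__name__)
```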

Here is the code for my model (simplified):

```python
import torch
import torch.nn as nn
from gymnasium.spaces import flatdim  # `from gym.spaces import flatdim` on older Ray
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class MyModel(TorchModelV2, nn.Module):
    def __init__(self, obs_space, act_space, num_outputs, *args, **kwargs):
        TorchModelV2.__init__(self, obs_space, act_space, num_outputs, *args, **kwargs)
        nn.Module.__init__(self)
        # The flat observation has the 19-dim action mask appended,
        # so the network only sees the non-mask part.
        self.model = nn.Sequential(
            nn.Linear(flatdim(obs_space) - act_space.n, 8192),
            nn.Linear(8192, num_outputs),
        )

    def forward(self, input_dict, state, seq_lens):
        # The last 19 entries of obs_flat are the action mask.
        assert torch.equal(input_dict["obs_flat"][:, -19:], input_dict["obs"]["action_mask"])
        model_out = self.model(input_dict["obs_flat"][:, :-19])
        return model_out, state
```
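
For reference, the masking step I was trying to reproduce from the examples boils down to adding a very large negative number to the logits of unavailable actions before the softmax. This is a sketch of that pattern, not my working code:

```python
import torch

FLOAT_MIN = torch.finfo(torch.float32).min  # stand-in for -inf that avoids NaNs

def mask_logits(logits: torch.Tensor, action_mask: torch.Tensor) -> torch.Tensor:
    """Push logits of unavailable actions (mask == 0) toward -inf so the
    softmax assigns them ~zero probability; available actions are unchanged."""
    inf_mask = torch.clamp(torch.log(action_mask), min=FLOAT_MIN)
    return logits + inf_mask

logits = torch.tensor([[1.0, 2.0, 3.0]])
mask = torch.tensor([[1.0, 0.0, 1.0]])  # action 1 is unavailable
probs = torch.softmax(mask_logits(logits, mask), dim=-1)
# probs[0, 1] is ~0; the remaining probability mass renormalizes
# over the allowed actions 0 and 2.
```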

Any help would be greatly appreciated!