[Contribution] [Help needed] Implementing easy action masking for distributional and dueling DQN

How severely does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

Hi, I’ve implemented a simple action masking method in compute_q_values for DQN (including the dueling and distributional variants). I made a fork with my committed changes, but I want to avoid introducing anti-patterns, so I’m looking for feedback here. (Am I even in the right place?)

For now I have only implemented the changes for torch. They require an additional attribute on the DQN model. The attribute is optional, so it shouldn’t affect existing code or people who don’t want to use the feature. The workflow from older versions of Ray (i.e. redefining a custom model) is unchanged.
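To make the idea concrete for reviewers, here is a minimal sketch of the masking step in plain torch. It is not the code from my fork and not RLlib API: the helper name mask_q_values and its optional action_mask argument are hypothetical, and I’m assuming the mask has the same [batch, num_actions] shape as the Q-values, with 1 for valid actions and 0 for invalid ones. For the distributional case, the assumption is that the mask is applied after the per-atom values have been reduced to expected Q-values.

```python
from typing import Optional

import torch


def mask_q_values(
    q_values: torch.Tensor,
    action_mask: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    """Hypothetical helper (not part of RLlib): hide invalid actions.

    q_values:    [batch, num_actions] tensor of Q-values.
    action_mask: optional [batch, num_actions] tensor, 1 = valid, 0 = invalid.
    """
    # No mask supplied: behave exactly as before (the opt-in case).
    if action_mask is None:
        return q_values

    # Replace Q-values of invalid actions with the smallest finite float,
    # so that argmax can never pick them.
    min_value = torch.finfo(q_values.dtype).min
    return torch.where(
        action_mask.bool(),
        q_values,
        torch.full_like(q_values, min_value),
    )


q = torch.tensor([[1.0, 5.0, 3.0], [2.0, 0.5, 4.0]])
mask = torch.tensor([[1, 0, 1], [1, 1, 0]])

masked = mask_q_values(q, mask)
greedy_actions = masked.argmax(dim=1)  # best valid action per row
```

Using the smallest finite float rather than -inf is a deliberate choice in this sketch: it avoids NaNs if the masked values are later passed through arithmetic such as a softmax or a mean.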

Thanks.

Hi @Quoding ,

Sure, this is the right place! RLlib already offers action masking. Are you planning to contribute? If so, your changes should not be on the master branch of your fork, but on a feature branch (for example “my_action_masking_contribution”). As soon as they’re there, you will be able to open a Pull Request against the official Ray repo and we can look into it.

Have a look at our own ActionMaskModel!

Cheers