Invalid action masking for variable sized permutation action

jeremysalwen · May 27, 2021, 10:07pm

If you have an action space which involves permuting a variable-sized set of objects (up to 25), is there a way to represent the corresponding invalid action mask in RLlib?

With my own (non-RLlib) custom learning code, this is easy to do by just storing the number of objects at each step as an integer, and using a Plackett-Luce distribution to efficiently sample permutations over this variable-sized set of objects.
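To make that concrete, here is a rough sketch of the kind of thing I mean, not my actual code. It assumes the policy outputs one score per slot (padded to 25) and uses the Gumbel perturbation trick to draw a Plackett-Luce sample; the names `sample_permutation`, `log_prob`, and `MAX_OBJECTS` are made up for illustration.

```python
import numpy as np

MAX_OBJECTS = 25  # upper bound on the set size, as in the question


def sample_permutation(scores, num_objects, rng):
    """Draw a Plackett-Luce permutation over the first `num_objects` slots.

    Adding Gumbel noise to the scores and sorting in descending order
    yields a Plackett-Luce sample. Padding slots get -inf so they sort last.
    """
    scores = np.asarray(scores, dtype=np.float64)
    valid = np.arange(scores.shape[0]) < num_objects  # the "mask" is just a count
    perturbed = np.where(valid, scores + rng.gumbel(size=scores.shape[0]), -np.inf)
    order = np.argsort(-perturbed, kind="stable")
    return order[:num_objects]


def log_prob(scores, perm):
    """Log-probability of `perm` under Plackett-Luce with the given scores."""
    s = np.asarray(scores, dtype=np.float64)[perm]
    # At position i the normaliser is logsumexp over the items not yet placed,
    # i.e. s[i:], computed here as a reversed running logaddexp.
    remaining = np.logaddexp.accumulate(s[::-1])[::-1]
    return float(np.sum(s - remaining))


rng = np.random.default_rng(0)
scores = rng.normal(size=MAX_OBJECTS)  # stand-in for policy outputs
perm = sample_permutation(scores, num_objects=7, rng=rng)
print(perm, log_prob(scores, perm))
```

The point is that the only per-step "mask" information needed is the integer `num_objects`; nothing of size 25! is ever materialised.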

However, I don’t see how I could directly use RLlib’s support for invalid action masking to do this, since, for starters, the number of possible actions is too large to even enumerate.
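For a sense of scale, a quick back-of-the-envelope check (assuming each action is a full ordering of the n objects present, with n between 1 and 25):

```python
import math

MAX_OBJECTS = 25

# Orderings of the largest set alone.
num_orderings = math.factorial(MAX_OBJECTS)

# Orderings summed over every possible set size from 1 to 25.
total_actions = sum(math.factorial(n) for n in range(1, MAX_OBJECTS + 1))

print(f"{num_orderings:.3e}")  # 1.551e+25
print(total_actions > num_orderings)  # True
```

So a flat mask with one entry per permutation is out of the question.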

Is such a custom action space supported by RLlib with invalid action masking?
