How to deal with an irregular action space?

Hello! I’m quite new to Ray, and I’ve run into a difficulty in my task:

I defined an action space as a tuple of several Discrete(7) spaces. However, I don’t want any two of the sub-actions to take the same value during training. Please help me with this, or point me to some keywords…

For example, the action can be [1, 2, 3, 6], but it cannot be [1, 1, 3, 6], since the first two entries are the same.

I suspect action masking cannot do this, or am I missing something?

Thank you.

Hi @RAY_fresh,

The way I would handle this is to use action masking and keep track of the masking criteria in the environment. That would be the simplest way to handle it.
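To make the idea a bit more concrete, here is a minimal sketch in plain Python (no Ray or Gym imports). It assumes the composite action is built one sub-action per step, so the environment can mask out values that were already picked. The class `DistinctChoiceTracker` and the helper `sample_masked` are hypothetical names used only for illustration, and the uniform sampler is a stand-in for a trained policy. In a real setup the mask would typically be returned as part of the observation and applied to the policy logits inside the model.

```python
import random

N_CHOICES = 7  # each sub-action is Discrete(7)
N_SLOTS = 4    # sub-actions per composite action, e.g. [1, 2, 3, 6]


class DistinctChoiceTracker:
    """Hypothetical helper: remembers chosen values and exposes a 0/1 mask."""

    def __init__(self, n_choices=N_CHOICES, n_slots=N_SLOTS):
        self.n_choices = n_choices
        self.n_slots = n_slots
        self.reset()

    def reset(self):
        self.chosen = []
        return self.action_mask()

    def action_mask(self):
        # 1 = still allowed, 0 = already used in this composite action.
        return [0 if v in self.chosen else 1 for v in range(self.n_choices)]

    def step(self, sub_action):
        if self.action_mask()[sub_action] == 0:
            raise ValueError(f"sub-action {sub_action} is masked out")
        self.chosen.append(sub_action)
        done = len(self.chosen) == self.n_slots
        return self.action_mask(), done


def sample_masked(mask, rng):
    """Stand-in for a policy: sample uniformly among the allowed values."""
    allowed = [i for i, m in enumerate(mask) if m == 1]
    return rng.choice(allowed)


rng = random.Random(0)
tracker = DistinctChoiceTracker()
mask = tracker.reset()
done = False
while not done:
    mask, done = tracker.step(sample_masked(mask, rng))
print(tracker.chosen)  # four distinct values from 0..6
```

The key design choice is that the mask depends on what has already been chosen, which is why the sub-actions are picked sequentially here rather than all at once.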

Hi @mannyv,

That sounds reasonable and workable, but it is a bit abstract to me…

Could you please explain a bit more?

Thank you for your kindness!

This looks like a case for an auto-regressive action space.
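As a rough illustration of what that means, the sketch below samples the sub-actions one after another and masks out values that earlier sub-actions already took. It is plain Python with hypothetical names (`masked_softmax`, `sample_distinct`), not RLlib's API. One simplification to be aware of: the per-slot logits are fixed inputs here, standing in for network outputs. In a full auto-regressive model the logits for a later slot would themselves be computed from the earlier choices.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax over allowed entries only; masked entries get probability 0."""
    top = max(l for l, ok in zip(logits, mask) if ok)  # for numerical stability
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_distinct(logits_per_slot, rng):
    """Sample one value per slot, never repeating an earlier value.

    Returns the composite action and its joint log-probability, which is
    the sum of the conditional log-probabilities of each sub-action.
    """
    n = len(logits_per_slot[0])
    mask = [True] * n
    action, log_prob = [], 0.0
    for logits in logits_per_slot:
        probs = masked_softmax(logits, mask)
        choice = rng.choices(range(n), weights=probs, k=1)[0]
        action.append(choice)
        log_prob += math.log(probs[choice])
        mask[choice] = False  # later slots can no longer pick this value
    return action, log_prob


# Four slots over Discrete(7), with uniform (all-zero) logits as a placeholder.
action, log_prob = sample_distinct([[0.0] * 7 for _ in range(4)], random.Random(0))
print(action, log_prob)
```

Because each sub-action is conditioned on the previous ones, the joint log-probability stays well defined even though the set of allowed values shrinks from slot to slot.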