Action Masking without Including "action_mask" in the Observation Space?

VisionZUS29 · October 31, 2024, 8:03pm

Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I had a working action-masking model for discrete PPO, but I was trying to find a way to remove “action_mask” from the observation vector, since the mask can become large and take up a lot of data.

Anyways, I couldn't find a way to access or pass environment variables, even ones that were in the observation vector, so that the action_mask vector could be created inside the CustomActionMaskModel. The idea is to perform the same check I did in the environment to fill the “action_mask” observation, but in the model code, so that the action_mask vector isn't passed as an observation. The issue is that the values inside the model are preprocessed and are tensors…

Maybe I'm thinking about it wrong, but an agent doesn’t need to see which actions are available as an observation if the probability of selecting the unavailable actions is 0 in the network…
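To make that concrete, here is a minimal NumPy sketch (not RLlib code) of why a masked action never gets sampled: adding a very large negative number to the logits of unavailable actions drives their softmax probability to zero. The function name `masked_softmax` and the example values are made up for illustration, and it assumes at least one action per row is available.

```python
import numpy as np

def masked_softmax(logits, action_mask):
    """Softmax over the last axis with unavailable actions forced to probability 0.

    logits:      array of shape (batch, n_actions)
    action_mask: array of the same shape, 1 = available, 0 = unavailable
    """
    logits = np.asarray(logits, dtype=np.float64)
    # 0 offset for available actions, a huge negative offset for the rest.
    penalty = np.where(np.asarray(action_mask) > 0, 0.0, np.finfo(np.float32).min)
    masked = logits + penalty
    # Subtract the row maximum for numerical stability before exponentiating.
    masked = masked - masked.max(axis=-1, keepdims=True)
    exp = np.exp(masked)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical example: three actions, the middle one is unavailable.
probs = masked_softmax([[1.0, 2.0, 3.0]], [[1, 0, 1]])
```

In `probs` the middle entry is exactly 0 and the other two sum to 1, so the sampler can never pick the masked action regardless of whether the mask also appears in the observation.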

The reason I want this is that with 5 float values I can derive the mask for all N actions from my env and action space, instead of having to pass N booleans, one per action.
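A rough sketch of the kind of check I mean, written with NumPy so it runs on its own. The rule itself is hypothetical (my real check is different): here the first two of the 5 floats are treated as the lowest and highest allowed action index, and `N_ACTIONS` and `mask_from_features` are placeholder names. The point is that the check uses only broadcast comparisons, so it works on a whole batch at once, which is what a model sees instead of a single env state.

```python
import numpy as np

N_ACTIONS = 8  # hypothetical action-space size

def mask_from_features(obs):
    """Build a (batch, N_ACTIONS) 0/1 mask from a (batch, 5) float observation.

    Hypothetical rule, for illustration only: obs[:, 0] and obs[:, 1] hold the
    lowest and highest allowed action index; the other three floats are unused.
    """
    obs = np.asarray(obs, dtype=np.float32)
    low = obs[:, 0:1]    # shape (batch, 1)
    high = obs[:, 1:2]   # shape (batch, 1)
    actions = np.arange(N_ACTIONS, dtype=np.float32)[None, :]  # shape (1, N_ACTIONS)
    # Broadcasting compares every action index against each row's bounds.
    return ((actions >= low) & (actions <= high)).astype(np.float32)

batch = np.array([[2.0, 4.0, 0.0, 0.0, 0.0],
                  [0.0, 7.0, 0.0, 0.0, 0.0]])
mask = mask_from_features(batch)
```

With this rule the first row allows only actions 2 to 4 and the second row allows all eight. The same comparisons should carry over to tensor operations inside a model's forward pass, provided the 5 floats reach the model with their original values; if a preprocessor rescales or reorders them, the rule would have to account for that.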
