Action masking redux

  • High: It blocks me from completing my task.

This is an old topic, but I’m addressing it to the Ray team. I’m not the only one who would like some answers.

I’m working on a PhD dissertation that involves reinforcement learning. I have developed a custom multi-agent environment that does not use action masking, and a wrapper around it that adds action masking. The observation returned at each step is defined by:
env.observation_space = Dict('action_mask': Box(0.0, 1.0, (43,), float32), 'observations': Box(0.0, 1.0, (127,), float32))
where the action_mask is calculated by the wrapper's step() function.

Ray does not, out of the box, support either action masking or dictionaries in the observation space. However, I had found a daily build (ray.__version__ == '3.0.0.dev0') that included a modified RL module for torch action masking. Using that very specific version of Ray, the project works.

Unfortunately, that version no longer exists, and neither does the modified torch RL module. Current Ray versions (2.38 or 2.40) do not allow action masking or dictionaries in observations. That means my setup is not reproducible, since it is no longer possible to obtain (or even identify) the specific daily build that does work.

I have, of course, tried writing custom torch RL modules, spending months on the effort, but no matter what I do, some feature or other of the current releases makes it a futile attempt.

So, the questions:

  1. Why was a previously functioning set of features removed?
  2. When will action-masking in a multi-agent environment be added to the current system?

/s/ A long-term (based on previous projects) but extremely frustrated user

I’ve seen your questions on Slack and am having similar frustrations.

You’ve probably already seen my answer on Slack, but for the sake of anyone else reading this post, I have created a project called Ray Masked that contains a very minimal custom environment using action masking that can be trained in the latest version of RLlib (2.43.0).

I built it by working backwards from the action_masking_rl_module.py example, which works for me but which I find difficult to understand. My example is easier to modify.

It took me several weeks to write this. There’s no single reason it was so difficult; action masking is simply a very delicate, undocumented feature in RLlib. The Slack channel has a record of my debugging efforts, which may be helpful to others who are stuck.

I have a multi-agent branch of my project in which I’m trying to write a multi-agent version of the same minimal environment. I haven’t succeeded yet.

Thanks for reaching out. I’ll look at it tonight.

Jim

I also got something similar to work for a single agent. My project is multi-agent, and I’ve had no luck with that yet. I’ll be extremely interested in seeing what you come up with.

wpm - I think what you’ve actually done is develop an action masking environment.

What is really needed is a custom RL module that can handle the environment. With current Ray, in my case 2.40, dictionaries are not allowed in the environment observation space, and action masking is completely unhandled.

This URL:

points to a GitHub repository written by me that contains an RL Module, a testing gymnasium environment, and a Jupyter notebook that pulls it together.

This is not the last word, by any means. My testing shows that it doesn’t fill the bill in multi-agent work, and attempts to make it do so have thus far failed.

You’re right, I only worked on the environment and didn’t do anything to customize the module. For my purposes, I think the ActionMaskingTorchRLModule that comes with Ray will work fine, but that remains to be seen.

Pretty sure you’ll find that the ActionMaskingTorchRLModule won’t work.

That’s what all the fuss is about.