Passing extra action information to the environment (DQN)

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

My question is reasonably similar to "Action space with multiple output?", although I think the main difference is that my situation is a bit more specific and can maybe be solved with workarounds instead of proper multi-discrete action spaces.

During my neural network's forward pass, I have a mechanism that chooses a specific action (the first action). I then return the Q-values to Ray, and DQN picks the second action via argmax as normal. I am wondering whether there is any way to communicate this first action to my environment.

I noticed that DQN doesn't support MultiDiscrete spaces, but would it be possible with this approach? And is there a clear example somewhere of how I could implement something like this myself, to give DQN very bare-bones support for multi-discrete spaces (only passing an extra action to the environment), or am I thinking in the wrong direction?
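For context, one workaround I've been considering is to flatten the two discrete dimensions into a single `Discrete(n1 * n2)` action space (which DQN does support) and decode the flat index back into the pair inside the environment's `step()`. A rough sketch of what I mean, where `N_FIRST`, `N_SECOND`, and the helper names are just placeholders I made up:

```python
# Sketch: flatten two discrete sub-actions into one Discrete(N_FIRST * N_SECOND)
# index so a single-discrete algorithm like DQN can emit it, then decode the
# index back into the (first, second) pair inside the environment.

N_FIRST = 4   # size of the first (network-chosen) action dimension
N_SECOND = 5  # size of the second (argmax-chosen) action dimension


def encode_action(first: int, second: int) -> int:
    """Pack two discrete sub-actions into one flat Discrete index."""
    return first * N_SECOND + second


def decode_action(flat: int) -> tuple[int, int]:
    """Recover (first, second) from the flat index, e.g. in the env's step()."""
    return divmod(flat, N_SECOND)
```

This doesn't let the network choose the first action independently of the Q-value argmax, though, which is why I'm asking whether there's a cleaner way to pass that extra information through.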