Discrete tuple action space for simple Q

jmugan · September 30, 2021, 5:03pm

Hi,

I’m trying to use simple Q with a tuple action space, but it doesn’t seem set up for that. I want to use simple Q because I created a custom model for action masking, and when I use DQN the num_outputs is not set to the number of actions but rather the hidden size (I guess because of the dueling Qs or something).

Any advice? I’ve got this working with PPO, but since I have discrete actions, it seems like I should use Q.

Thanks!
Jonathan

mannyv · September 30, 2021, 6:46pm

Hi @jmugan,

I think what you would want to do here is:

1. In your forward function, store the mask as a member variable in the model. Then do the standard forward pass with the regular observation portion of the input.
2. Override get_q_value_distributions in your custom model to use the member variable stored in step 1 to do the masking.

You may need to mask out the “action_scores” too.
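The two steps above can be sketched without any framework code. This is only an illustration of the pattern, not RLlib's API: the class name `MaskedQSketch`, the method `mask_q_values`, and the observation keys `"action_mask"` and `"observations"` are all assumptions made for the example.

```python
import math


class MaskedQSketch:
    """Hypothetical stand-in for a custom model; not an RLlib class."""

    def __init__(self):
        self.action_mask = None

    def forward(self, obs):
        # Step 1: remember the mask and pass on only the real observation.
        # The dict keys are assumed names for this sketch.
        self.action_mask = obs["action_mask"]
        return obs["observations"]

    def mask_q_values(self, q_values):
        # Step 2: give invalid actions (mask == 0) a Q-value of -inf so that
        # an argmax over the result can never select them.
        return [q if m else -math.inf
                for q, m in zip(q_values, self.action_mask)]


model = MaskedQSketch()
model.forward({"action_mask": [1, 0, 1], "observations": [0.2, 0.7]})
masked = model.mask_q_values([0.5, 2.0, 1.5])
# Greedy action over the masked Q-values.
best_action = max(range(len(masked)), key=masked.__getitem__)
```

In a real torch or TF model you would probably add a large negative finite number instead of -inf, since infinities can turn into NaNs in the loss.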

jmugan · September 30, 2021, 7:19pm

Cool, thanks! That makes sense, but it gives me the error below. I have a tuple of discrete action spaces, which in theory should be fine, but it looks like it is looking for a simple discrete space.

    "Action space {} is not supported for DQN.".format(action_space))
    ray.rllib.utils.error.UnsupportedSpaceException: Action space Tuple(Discrete(58), Discrete(58), Discrete(58), Discrete(58), Discrete(58), Discrete(58)) is not supported for DQN.

mannyv · September 30, 2021, 7:28pm

@jmugan

Sorry, I totally misread your question the first time around. Simple Q / DQN does not support anything other than a single Discrete action space out of the box.

github.com/sven1977/ray/blob/45217496aeb1adaa56ea0836544a2dd684822f0f/rllib/agents/dqn/simple_q_tf_policy.py#L82-L85

    if not isinstance(action_space, gym.spaces.Discrete):
        raise UnsupportedSpaceException(
            "Action space {} is not supported for DQN.".format(action_space))

@sven1977 or @gjoliver might be able to weigh in with changes you could make to get it to work. Or you could use PPO.
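One generic workaround people sometimes reach for is flattening a tuple of Discrete spaces into a single Discrete index (mixed-radix encoding). The sketch below is plain Python with hypothetical helper names, not anything RLlib provides. It also shows why this is probably not practical for the space in this thread: six sub-actions of size 58 give 58**6 combinations, roughly 38 billion Q-outputs.

```python
from math import prod


def flatten_action(action, sizes):
    """Encode a tuple of sub-actions as one integer (mixed radix)."""
    index = 0
    for a, n in zip(action, sizes):
        if not 0 <= a < n:
            raise ValueError(f"sub-action {a} out of range for size {n}")
        index = index * n + a
    return index


def unflatten_action(index, sizes):
    """Decode the integer back into the tuple of sub-actions."""
    action = []
    for n in reversed(sizes):
        index, a = divmod(index, n)
        action.append(a)
    return tuple(reversed(action))


sizes = (58,) * 6               # Tuple(Discrete(58), ...) from the error above
num_flat_actions = prod(sizes)  # size of the equivalent single Discrete space
```

For small tuples (say, two or three sub-actions with a handful of choices each) the flat space stays manageable; here it does not, which is one reason a policy-gradient method like PPO is the easier route.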

jmugan · October 14, 2021, 10:02pm

I went back to a simple discrete action space, but it is not letting me override get_q_value_distributions. It is still calling the one in DQNTorchModel even though my custom model has the function. In the debugger, the type is OurModel_as_DQNTorchModel. I’ve never seen that kind of thing before. The way the models are constructed seems to be beyond my complexity horizon. I’ll create a separate thread.
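A possible explanation for the OurModel_as_DQNTorchModel type, offered as an assumption rather than confirmed behavior: if the custom model does not subclass the interface class, the framework may build a wrapper class by multiple inheritance with the interface listed first. In that case Python's method resolution order picks the interface's method over the custom one. The classes below are plain-Python stand-ins (`InterfaceModel` plays the role of DQNTorchModel), not RLlib code.

```python
class InterfaceModel:
    """Stand-in for the interface class (e.g. DQNTorchModel)."""

    def get_q_value_distributions(self):
        return "interface"


class OurModel:
    """Custom model that does NOT subclass the interface."""

    def get_q_value_distributions(self):
        return "custom"


# Assumed wrapper construction: interface first, custom class second.
# With this base order, InterfaceModel's method shadows OurModel's.
Wrapped = type("OurModel_as_InterfaceModel", (InterfaceModel, OurModel), {})


class OurModelFixed(InterfaceModel):
    """Custom model that subclasses the interface directly."""

    def get_q_value_distributions(self):
        return "custom"


wrapped_result = Wrapped().get_q_value_distributions()
fixed_result = OurModelFixed().get_q_value_distributions()
```

If that is what is happening, having the custom model inherit from DQNTorchModel directly (so no wrapper is needed) should let the override take effect.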
