When I set config["dueling"]=False, I expect to run the vanilla DQN, in which the network returns one Q-value per action. Instead, the network returns a single value node plus an advantage value for each action, so it appears to still be running the Dueling DQN algorithm. Here are the layer names and sizes I got by calling agent.get_weights():
'_convs.0._model.1.weight' = {ndarray: (16, 4, 8, 8)}
'_convs.1._model.1.weight' = {ndarray: (32, 16, 4, 4)}
'_convs.1._model.1.bias' = {ndarray: (32,)}
'_convs.2._model.0.weight' = {ndarray: (256, 32, 11, 11)}
'_convs.2._model.0.bias' = {ndarray: (256,)}
'_value_branch._model.0.weight' = {ndarray: (1, 256)}
'_value_branch._model.0.bias' = {ndarray: (1,)}
'advantage_module.dueling_A_0._model.0.weight' = {ndarray: (256, 256)}
'advantage_module.dueling_A_0._model.0.bias' = {ndarray: (256,)}
'advantage_module.A._model.0.weight' = {ndarray: (4, 256)}
'advantage_module.A._model.0.bias' = {ndarray: (4,)}
from ray.rllib.agents.dqn import DQNTrainer, DEFAULT_CONFIG

config = DEFAULT_CONFIG.copy()
config['num_cpus_per_worker'] = 8
config["double_q"] = False            # disable double Q-learning
config["framework"] = "torch"
config["dueling"] = False             # disable the dueling (value + advantage) head
config["prioritized_replay"] = False  # use a uniform replay buffer
agent = DQNTrainer(config=config, env='BreakoutNoFrameskip-v0')
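For reference, a minimal sketch (assuming the agent built above and RLlib's default single-policy setup, where the policy id is "default_policy") of how the layer names and shapes above can be listed, and how the built torch model can be inspected directly:

# Sketch: list layer names/shapes from the local policy's weights.
weights = agent.get_weights()["default_policy"]
for name, arr in weights.items():
    print(name, arr.shape)

# The torch model itself can also be printed to see whether an
# advantage/dueling head is attached to the network.
print(agent.get_policy().model)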
Versions and OS: Ray 1.5.1, PyTorch 1.7.0, CentOS 7.