Saved ONNX model using DQN dueling policy

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I have trained a DQN dueling agent using RLlib, restored a checkpoint, and saved the model to ONNX. When I run the exported model for inference, the output has size 256 rather than the size of my action space. I suspect the reason is the post-processing RLlib applies to support the dueling mechanism.
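For reference, this is roughly what I am doing (a minimal sketch; `"CartPole-v1"` and `checkpoint_path` stand in for my actual environment and checkpoint):

```python
from ray.rllib.algorithms.dqn import DQNConfig

# Rebuild the algorithm and restore the trained checkpoint.
algo = (
    DQNConfig()
    .environment("CartPole-v1")   # placeholder for my actual env
    .training(dueling=True)
    .build()
)
algo.restore(checkpoint_path)

# Export the policy's model to ONNX (opset 11).
algo.export_policy_model("./onnx_model", onnx=11)
```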

It therefore seems that RLlib does not currently export the proper output size to ONNX (or TorchScript) for models trained with dueling DQN.

My question is: since the final-layer weights must be stored somewhere (after all, `compute_single_action(obs)` works within Ray), how can I access these weights to build a final ONNX model for inference?
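In case it clarifies what I mean, I can already inspect the weights like this (Ray 2.x, Torch policy); what I am missing is how to assemble them into the final Q head:

```python
policy = algo.get_policy()

# Dict of numpy arrays keyed by layer name, including the dueling heads.
weights = policy.get_weights()
for name, array in weights.items():
    print(name, array.shape)

# The underlying torch.nn.Module is also reachable directly.
torch_model = policy.model
print(torch_model)
```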

Ran into this same problem using Ray 2.1.0. Any suggestions from the Ray team on how to work around this issue?

Same issue here with Ray 2.3.0!

cc: @arturn @gjoliver Any suggestions?

256 is the size of the last layer of the base model, not the action output. RLlib passes this output through a distributional model to compute the final Q-values.

The reason we do that is to support distributional Q-learning.
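Concretely, in the standard distributional (C51-style) formulation, the head predicts a probability distribution over a fixed support of atoms, and the scalar Q-value is the expectation over that support:

$$Q(s, a) = \sum_{i=1}^{N_{\text{atoms}}} z_i \, p_i(s, a)$$

where the $z_i$ are the fixed atom values and $p_i(s, a)$ the predicted probabilities. This is why the raw 256-dim embedding alone does not give you Q-values.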

I would suggest using RLlib to serve the model; `algo.compute_single_action(obs, explore=False)` applies all of this post-processing for you. Alternatively, replicate the post-processing yourself if you want to use the model outside of RLlib.
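If you do want a standalone ONNX graph, one option is to wrap the base model and the dueling heads into a single torch module and export that yourself. A minimal sketch, assuming the default `DQNTorchModel` with `num_atoms=1` and a flat Box observation space; the attribute names `advantage_module` and `value_module` come from that class and may differ across Ray versions, so verify them against your installed source:

```python
import torch

policy = algo.get_policy()
rllib_model = policy.model  # assumed to be a DQNTorchModel

class DuelingQNet(torch.nn.Module):
    """Wraps the base net plus dueling heads so the exported graph
    outputs final Q-values instead of the 256-dim embedding."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, obs):
        # Base network -> feature embedding (the 256-dim output you saw).
        features, _ = self.model({"obs": obs}, [], None)
        advantages = self.model.advantage_module(features)  # (B, num_actions)
        value = self.model.value_module(features)           # (B, 1)
        # Standard dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a).
        return value + advantages - advantages.mean(dim=1, keepdim=True)

dummy_obs = torch.randn(1, *policy.observation_space.shape)
torch.onnx.export(
    DuelingQNet(rllib_model),
    dummy_obs,
    "dueling_q.onnx",
    input_names=["obs"],
    output_names=["q_values"],
)
```

Note this sketch skips the distributional case (`num_atoms > 1`), where you would also need the expectation over atoms shown above.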
