Saved ONNX model using DQN dueling policy

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I trained a dueling DQN agent with RLlib, restored a checkpoint, and exported the policy as an ONNX model. When I run the exported model for inference, the output has size 256 rather than the output size I expect from my model. My guess is that the exported graph stops before the post-processing RLlib performs to support the dueling mechanism.

It therefore seems that RLlib does not currently produce the proper output size when exporting a model trained with dueling DQN to ONNX (or TorchScript).
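
For reference, the post-processing I have in mind is the standard dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)). Below is a minimal sketch of that step in plain Python, assuming the exported graph ends at the 256-dimensional feature layer and that the advantage and value heads are applied separately afterwards. The function name and the numbers are hypothetical and only illustrate the arithmetic; this is not RLlib's actual code.

```python
from statistics import fmean


def dueling_q_values(advantages, state_value):
    """Combine per-action advantages and a scalar state value into Q-values.

    Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    """
    mean_advantage = fmean(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical head outputs for a 3-action problem (not taken from my model).
q_values = dueling_q_values([1.0, 2.0, 3.0], 0.5)

# Greedy action: index of the largest Q-value.
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
```

If this is what the export leaves out, the final ONNX model would need the two heads and this combination appended after the 256-dimensional output.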

My question is: the final-layer weights must be stored somewhere, since compute_single_action(obs) works within Ray. How can I access these weights to build a final ONNX model for inference?