Suggestion: adding NAF: Normalized Advantage Function — DQN for Continuous Control Tasks

I have been testing the DQN-NAF (Normalized Advantage Function) algorithm, which targets continuous-action environments, and I wonder whether it could be added to RLlib. The algorithm isn't popular, but it's useful in some cases.

More details: https://arxiv.org/pdf/1603.00748.pdf
Article: NAF: Normalized Advantage Function — DQN for Continuous Control Tasks
Code: https://github.com/BY571/Normalized-Advantage-Function-NAF-
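
For context, here is a rough sketch of the core idea as I understand it from the paper: Q(s, a) = V(s) - 0.5 * (a - mu(s))^T P(s) (a - mu(s)), where P = L L^T and L is lower-triangular with an exponentiated diagonal. Because the advantage term is never positive, the greedy action is simply mu(s), which is what makes Q-learning workable with continuous actions. The function name and the flat `l_entries` layout are my own illustration, not taken from the linked repo or from RLlib, and in practice `value`, `mu` and `l_entries` would be network outputs.

```python
import math

def naf_q_value(value, mu, l_entries, action):
    """Sketch of the NAF Q-value for a single state (hypothetical helper).

    value:     V(s), a float
    mu:        greedy action mu(s), list of n floats
    l_entries: n*(n+1)/2 raw entries of lower-triangular L, row by row
    action:    action a to evaluate, list of n floats
    """
    n = len(mu)
    # Fill the lower-triangular matrix L; exponentiate the diagonal so that
    # P = L L^T is positive definite.
    L = [[0.0] * n for _ in range(n)]
    idx = 0
    for i in range(n):
        for j in range(i + 1):
            entry = l_entries[idx]
            L[i][j] = math.exp(entry) if i == j else entry
            idx += 1
    diff = [a - m for a, m in zip(action, mu)]
    # (a - mu)^T L L^T (a - mu) equals the squared norm of L^T (a - mu).
    proj = [sum(L[i][j] * diff[i] for i in range(n)) for j in range(n)]
    advantage = -0.5 * sum(p * p for p in proj)
    return value + advantage

# With all raw entries at zero, L is the identity matrix.
q_at_mu = naf_q_value(1.0, [0.5, -0.2], [0.0, 0.0, 0.0], [0.5, -0.2])
q_away = naf_q_value(1.0, [0.5, -0.2], [0.0, 0.0, 0.0], [1.5, -0.2])
```

Here `q_at_mu` equals V(s), and any other action gives a lower value, so no separate actor network or argmax search is needed.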