Example of a custom TensorFlow or PyTorch neural network

This question is related to this thread. I have a problem with NN saturation. I have tried modifying the hyper-parameters, the algorithms, and the data, but the problem has not been solved. As a last option, I have thought of implementing custom models for the algorithms. I have seen that the documentation page provides a description of the class; however, I have not found any examples of implementing custom models. Where can I find examples of custom NNs implemented in RLlib?

Hi @carlorop ,

you can find an example of a custom model here. Note that people often confuse "model" with "network". In RLlib, a model is a reinforcement learning model that is trained against a loss. This loss is often composed of several terms, more than one of which may come from a neural network. A model might therefore contain several networks or, more generally, approximation functions.

In the case of the custom model linked above, the model computes Q-values with a fully connected network (TorchFullyConnectedNetwork). If you want to create your own network, you need to implement a class similar to the TorchFullyConnectedNetwork.
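To make this more concrete, here is a minimal sketch of such a custom Torch model. The class name `MyFCNetwork`, the layer sizes, and the registered name are just placeholders, and it follows the general `TorchModelV2` pattern rather than the linked code exactly:

```python
import numpy as np
import torch
import torch.nn as nn

from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class MyFCNetwork(TorchModelV2, nn.Module):
    """A small fully connected network usable as a custom RLlib model."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)

        in_size = int(np.prod(obs_space.shape))
        self._hidden = nn.Sequential(
            nn.Linear(in_size, 256),
            nn.ReLU(),
            nn.Linear(256, 256),
            nn.ReLU(),
        )
        self._logits = nn.Linear(256, num_outputs)
        self._value_branch = nn.Linear(256, 1)
        self._features = None

    def forward(self, input_dict, state, seq_lens):
        # input_dict["obs_flat"] holds the flattened observation batch.
        obs = input_dict["obs_flat"].float()
        self._features = self._hidden(obs)
        return self._logits(self._features), state

    def value_function(self):
        # Called by the algorithm after forward() to get V(s) for the same batch.
        return torch.reshape(self._value_branch(self._features), [-1])


# Make the model available under a name that can be referenced in the config.
ModelCatalog.register_custom_model("my_fc_network", MyFCNetwork)
```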

Have you tried using rectified units (ReLU) and/or preprocessing the inputs by normalizing them?
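If it helps, these two suggestions roughly correspond to the following config options. This is a sketch; the key names are from the common RLlib trainer/model config, so double-check them against your version:

```python
config = {
    "model": {
        # Use rectified units instead of the default tanh activation.
        "fcnet_activation": "relu",
    },
    # Running mean/std normalization of observations before they hit the network.
    "observation_filter": "MeanStdFilter",
}
```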

@Lars_Simon_Zehnder Thank you for your help; your suggestions are greatly appreciated. Unfortunately, the problem is not solved yet.

I have taken a look at the example that you linked; however, I can't quite figure out how it works. I have the feeling that my NN may be saturated due to the rescaling and clipping at the last layer. In both the discrete and continuous versions of the environment, the actions are constant no matter which state is passed. Could I modify the clipping/rescaling of the last layer using custom models?

@carlorop ,

here is an example that shows perhaps more clearly how you can build your own model. Since it inherits from TFModelV2 (or TorchModelV2), as all models should, it overrides some functions of its superclass; you have to see which functions need to be overridden depending on what your model actually needs.
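In case it is useful, this is roughly how a registered custom model is plugged into the trainer config. It is only a sketch: `my_fc_network` is the placeholder name registered in the earlier snippet, and the environment and algorithm are just examples:

```python
import ray
from ray import tune

ray.init()

config = {
    "env": "Pendulum-v1",   # placeholder environment, use your own
    "framework": "torch",
    "model": {
        # Must match the name passed to ModelCatalog.register_custom_model().
        "custom_model": "my_fc_network",
    },
}

# "PPO" here is just an example; use whichever algorithm you are running.
tune.run("PPO", config=config, stop={"training_iteration": 10})
```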

In my opinion, saturation comes rather from not clipping the values. But have you checked from which layer on the values are saturated?
My suggestion is that you take a look at the DDPG policy definition, where the model also gets built, and see which parameters could help you avoid the clipping/rescaling.
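To give you an idea, these are the kinds of parameters I mean. This is a sketch based on the DDPG trainer defaults; exact key names and default values can differ between RLlib versions, so please verify them against the DDPG policy/config of your installation:

```python
ddpg_config = {
    # Architecture and activation of the actor/critic networks built by the policy.
    "actor_hiddens": [400, 300],
    "actor_hidden_activation": "relu",
    "critic_hiddens": [400, 300],
    "critic_hidden_activation": "relu",
    # Whether sampled actions get clipped to the action-space bounds.
    "clip_actions": True,
    # Whether actions are rescaled to/from the action-space range
    # (default depends on the RLlib version).
    "normalize_actions": False,
}
```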