Are there any limitations regarding activation functions in PPO?

How severely does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

Hey guys,

Are there any limitations on activation functions in custom models when they are used with the PPO algorithm?
Can the activation be chosen freely for all layers inside the model, with only the last layer (the logits and the value head, respectively) limited to a linear activation (“None”)?

No limitations, except for the last layer, which should stay linear (no activation) for both the logits and the value head.
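
To illustrate why the last layer is the exception, here is a minimal sketch in plain Python (standard library only, not RLlib code). It assumes a discrete action space where the logits go through a softmax, and it uses tanh as a stand-in for any bounded activation; the helper names are made up for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_action_prob(logits):
    """Probability of the most likely action under softmax(logits)."""
    return max(softmax(logits))


# Hypothetical raw outputs of the last layer for a 2-action policy.
linear_logits = [8.0, -8.0]

# The same outputs if the last layer had a tanh activation:
# every logit is squashed into (-1, 1).
tanh_logits = [math.tanh(x) for x in linear_logits]

# With linear logits the policy can become almost deterministic.
p_linear = max_action_prob(linear_logits)

# With tanh-bounded logits the logit gap is at most 2, so for two actions
# the largest probability is capped at 1 / (1 + exp(-2)).
p_tanh = max_action_prob(tanh_logits)
p_cap = 1.0 / (1.0 + math.exp(-2.0))

# The value head has the same problem: a tanh output cannot represent
# a return larger than 1, no matter how large the pre-activation is.
bounded_value = math.tanh(100.0)

print(p_linear, p_tanh, p_cap, bounded_value)
```

Hidden layers do not have this issue, since whatever they output is rescaled by the linear layer that follows.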
