How severely does this issue affect your experience of using Ray?
Medium: I need an explanation for better understanding.
What are the default activation functions used in PPO with a discrete action space? In particular, what is the output activation of the policy network, and what is the output activation of the value network?
I would also appreciate a pointer to where this is defined in the GitHub code!