I’ve spent 20 minutes digging into RLlib’s source code trying to answer the question: “Which optimizer does the PPO algorithm use by default?” Is it Adam, RMSProp, or something else?
Besides the answer itself, I’d be happy if you could tell me where in the code I should have looked to find it out myself.