I’ve spent 20 minutes digging into RLlib’s source code trying to answer the question: “Which optimizer does the PPO algorithm use by default?” Is it Adam, RMSProp, or something else?
Besides knowing the answer to this question, I’ll be happy if you could tell me where in the code I should have looked to find that out by myself.
Thanks for your help!
Adam is the default.
Here are the pointers for torch. TF is similar.
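On the "where should I have looked" part: one generic way to track this kind of thing down yourself is to ask Python which class in the inheritance chain actually defines a given method. Below is a minimal sketch using only the standard library. The classes `BasePolicy` and `PPOLikePolicy` and the method name `optimizer` are hypothetical stand-ins for illustration, not RLlib's actual classes; swap in the real policy class you are inspecting.

```python
import inspect


def find_definer(cls, method_name):
    """Return the first class in cls's MRO that defines method_name itself."""
    for klass in inspect.getmro(cls):
        # vars(klass) holds only attributes defined directly on klass,
        # not the ones it inherits from its parents.
        if method_name in vars(klass):
            return klass
    return None


# Hypothetical stand-ins (NOT RLlib classes): a base class that picks the
# optimizer, and an algorithm-specific subclass that does not override it.
class BasePolicy:
    def optimizer(self):
        return "Adam"


class PPOLikePolicy(BasePolicy):
    def loss(self):
        return 0.0


definer = find_definer(PPOLikePolicy, "optimizer")
print(definer.__name__)  # prints "BasePolicy"
```

For a class imported from an installed package, `inspect.getsourcefile(definer)` and `inspect.getsourcelines(definer)` should then give you the file and the line number to open.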
Thank you for the quick answer Manny!