Disclaimer: first time working with an RL agent!
For my project we are using a SAC agent to drive a vehicle in a simulation. Long story short, I've had zero success getting my agent to converge to a good policy. Worse, the agent seems to perform great early on and then its performance drops drastically later in training. This run used the default hyperparameters for SAC.
Due to a memory leak in the sim that I'm using, I can only run the agent for about 20K steps before having to save the last checkpoint, restart the sim, and restore.
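In case it helps diagnose things, here is roughly what my save/restore loop looks like. This is a simplified stand-in, not my actual code: the agent state is a plain dict, the buffer is a list, and the training step is a placeholder. The point is just the structure — I checkpoint the weights, the replay buffer, and the global step together so nothing resets between 20K-step chunks.

```python
import os
import pickle
import random

CKPT = "sac_checkpoint.pkl"
CHUNK_STEPS = 20_000  # sim leaks memory, so restart roughly this often

def save_checkpoint(state, path=CKPT):
    # Persist everything training depends on: network weights AND the replay
    # buffer, plus the global step so any schedules resume where they left off.
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_checkpoint(path=CKPT):
    if not os.path.exists(path):
        # Fresh run: placeholder params and an empty replay buffer.
        return {"params": {"w": 0.0}, "replay_buffer": [], "global_step": 0}
    with open(path, "rb") as f:
        return pickle.load(f)

def train_chunk(state, steps):
    # Stand-in for the real SAC interaction/update loop.
    for _ in range(steps):
        transition = (random.random(), random.random())  # (obs, reward) placeholder
        state["replay_buffer"].append(transition)
        state["global_step"] += 1
    return state

state = load_checkpoint()
state = train_chunk(state, CHUNK_STEPS)
save_checkpoint(state)
```

If it matters: I restore from the most recent checkpoint every time the sim is restarted, so in theory training should be continuous across chunks.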