I’m currently training a SAC agent (4-D continuous action space in [-1, 1], 19-D unbounded observation space). It produces successful episodes fairly quickly, but its performance degrades after a while (see graph below).
I left all params at their defaults, so I should probably run a tuning session for this agent. Besides the params you listed, which others should I tune, and what should their ranges be?
Have a look at the tuned examples section in our repo to see which parameters we modified in the past and to find a good starting point for a hopefully similar problem.
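If it helps to have something concrete before digging through the tuned examples, here is a minimal random-search sketch in plain Python. Everything in it is an assumption for illustration: the hyperparameter names are the usual SAC ones and may not match the config keys in your version, the ranges are rough guesses rather than values taken from the tuned examples, and `SEARCH_SPACE` and `sample_config` are hypothetical helpers, not part of any library.

```python
import math
import random

# Hypothetical search space. Names are common SAC hyperparameters and the
# ranges are illustrative assumptions, NOT values from the tuned examples.
SEARCH_SPACE = {
    "actor_learning_rate": ("loguniform", 1e-5, 1e-3),
    "critic_learning_rate": ("loguniform", 1e-5, 1e-3),
    "entropy_learning_rate": ("loguniform", 1e-5, 1e-3),
    "tau": ("loguniform", 1e-3, 5e-2),        # target network update rate
    "gamma": ("uniform", 0.95, 0.999),        # discount factor
    "train_batch_size": ("choice", [128, 256, 512]),
}


def sample_config(space, rng):
    """Draw one candidate configuration from the search space."""
    config = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "loguniform":
            # Sample uniformly in log space so each order of magnitude
            # is equally likely.
            low, high = spec[1], spec[2]
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "uniform":
            config[name] = rng.uniform(spec[1], spec[2])
        elif kind == "choice":
            config[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"unknown sampler kind: {kind}")
    return config


# Seeded generator so the same candidates are drawn on every run.
rng = random.Random(0)
candidates = [sample_config(SEARCH_SPACE, rng) for _ in range(8)]
for candidate in candidates:
    print(candidate)
```

Each sampled dict would then be mapped onto the agent's config for one trial. Learning rates and `tau` are sampled on a log scale because their useful values typically span several orders of magnitude.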