Parameter noise exploration and policy gradient / actor critic

mgerstgrasser · September 27, 2022, 2:32am

How severe does this issue affect your experience of using Ray?

None: Just asking a question out of curiosity

This is as much a conceptual question as a technical one: can parameter noise exploration be used with any of the policy gradient or actor-critic methods? I have a setting where action noise really doesn’t work well, but I also need stochastic policies, so DQN isn’t an option. I note that the original parameter noise paper used a policy gradient approach, but I can’t find much on it more recently. This question was asked here before, but the answer just pointed to DQN and DDPG. So I’m curious: is this something that can be done in RLlib with, e.g., PPO or any other policy gradient or actor-critic approach?
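To make the combination being asked about concrete, here is a minimal NumPy sketch of parameter noise applied to a stochastic policy. It is not RLlib code and does not use any RLlib API. The linear softmax policy, the helper names (`softmax`, `perturb`, `act`), and the noise scale of 0.1 are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def perturb(weights, stddev, rng):
    # Parameter noise: add Gaussian noise to the policy weights themselves,
    # typically once per episode, instead of adding noise to the actions.
    return weights + rng.normal(0.0, stddev, size=weights.shape)


def act(weights, obs, rng):
    # The policy stays stochastic: an action is sampled from the
    # categorical distribution defined by the (perturbed) weights.
    probs = softmax(weights @ obs)
    action = int(rng.choice(len(probs), p=probs))
    return action, probs


rng = np.random.default_rng(0)

# Hypothetical toy policy: 3 discrete actions, 4-dimensional observation.
weights = np.zeros((3, 4))
obs = np.array([1.0, 0.5, -0.5, 2.0])

# Sample one perturbed copy of the weights and act with it.
noisy_weights = perturb(weights, stddev=0.1, rng=rng)
action, probs = act(noisy_weights, obs, rng)
```

In this sketch the sampled action still comes from a proper distribution, so the policy remains stochastic while the exploration comes from the weight perturbation. How the perturbed log-probabilities should enter a PPO or actor-critic update is exactly the open part of the question and is not addressed here.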
