Increasing/decreasing exploration in the RLlib IMPALA algorithm

Hi community,

Which parameters in ray.rllib.algorithms.impala.impala (Ray 2.2.0) should I modify to increase or decrease the exploration of the IMPALA algorithm?

Hi @saeid93 ,

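By default, IMPALA explores with the StochasticSampling strategy: actions are sampled from the action distribution that the policy network outputs.
For continuous action spaces this is a Gaussian distribution whose parameters are produced by the network itself (for discrete action spaces it is a categorical distribution).
The distribution stays more stochastic (higher variance) when the loss includes a strong entropy bonus, which you get by using a high entropy coefficient. So the parameter you are looking for is entropy_coeff: raise it to increase exploration, lower it to decrease it.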
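
To make the effect of the entropy coefficient concrete, here is a minimal sketch in plain Python (no Ray needed). It assumes a 1-D Gaussian policy and a loss of the form "policy loss minus entropy_coeff times entropy". The helper names gaussian_entropy and loss_with_entropy_bonus are hypothetical and only for illustration; they are not RLlib functions.

```python
import math


def gaussian_entropy(std: float) -> float:
    """Differential entropy of a 1-D Gaussian with standard deviation `std`."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std ** 2)


def loss_with_entropy_bonus(policy_loss: float, std: float, entropy_coeff: float) -> float:
    """Hypothetical helper: subtract a weighted entropy term from the policy loss."""
    return policy_loss - entropy_coeff * gaussian_entropy(std)


narrow_std, wide_std = 0.1, 1.0  # low-variance vs. high-variance policy

for entropy_coeff in (0.0, 0.01, 0.1):
    narrow = loss_with_entropy_bonus(1.0, narrow_std, entropy_coeff)
    wide = loss_with_entropy_bonus(1.0, wide_std, entropy_coeff)
    # A positive gap means the wide (more exploratory) policy has the lower loss.
    print(f"entropy_coeff={entropy_coeff}: loss gap narrow - wide = {narrow - wide:.4f}")
```

With a coefficient of 0 both policies have the same loss, and the larger the coefficient, the more the optimizer is pushed toward the high-variance policy. In RLlib the coefficient should be settable with something like `ImpalaConfig().training(entropy_coeff=0.05)`; the value 0.05 is only an example, so tune it for your environment.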