How severely does this issue affect your experience of using Ray?
- Low: It annoys or frustrates me for a moment.
Is there a paper I can refer to in order to understand StochasticSampling-based or SoftQ-based exploration? I am using SoftQ exploration, but I need to understand the technique better and also cite it.
Is SoftQ exploration just another name for Boltzmann exploration?
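
To make the comparison concrete, here is a minimal sketch of what I understand Boltzmann exploration to be: sample an action from a softmax over the Q-values divided by a temperature. This is my own illustration and not RLlib's code; the function names `boltzmann_probs` and `sample_action` and the `temperature` parameter are hypothetical and only serve the example.

```python
import math
import random


def boltzmann_probs(q_values, temperature=1.0):
    """Return softmax probabilities over Q-values scaled by a temperature.

    Hypothetical helper for illustration; not part of Ray/RLlib.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in q_values]
    # Subtract the maximum before exponentiating for numerical stability.
    max_scaled = max(scaled)
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]
    print(boltzmann_probs(q, temperature=1.0))
    print(sample_action(q, temperature=1.0))
```

Under this reading, a high temperature makes the action choice close to uniform, and a temperature near zero makes it close to greedy. What I would like to confirm is whether SoftQ does the same thing.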