Replacing Rewards with Examples

ironv · July 9, 2021, 2:43pm

Recently I read the paper "Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification" by Eysenbach et al. The accompanying blog post gives an excellent description of the paper (and code); the paper itself is a lot more mathematical. Is there an RLlib implementation of this? Thx.
