Hi @manjrekarom,
I'm not sure if this will help, but maybe you can combine the two concepts linked below to achieve Jump-Start RL (JSRL).
> Hello there,
> I made a custom environment (with the Gym API) and was able to use RLlib to train agents in it. I wrote a rule-based "expert" that does not use a neural network to sample its actions. I want to sample trajectories from this expert and then warm-start my RL agents via imitation learning on those trajectories.
> To this end, I probably need to build an offline dataset to do imitation learning (specifically using the BC and MARWIL algorith…
https://docs.ray.io/en/latest/rllib/rllib-concepts.html#how-to-customize-policies
https://docs.ray.io/en/latest/rllib/rllib-offline.html
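For the recording side, here is a rough sketch of how the two could fit together, following the `SampleBatchBuilder` + `JsonWriter` pattern from the offline-data docs. The `expert_action` function, the output path, the episode count, and `CartPole-v0` are all placeholders for your own rule-based expert and custom env (this also assumes the classic 4-tuple Gym `step()` API):

```python
import gym
import numpy as np

from ray.rllib.evaluation.sample_batch_builder import SampleBatchBuilder
from ray.rllib.offline.json_writer import JsonWriter


def expert_action(obs):
    # Placeholder: your rule-based expert goes here.
    return 0


batch_builder = SampleBatchBuilder()
writer = JsonWriter("/tmp/expert-out")  # output dir for the offline dataset

env = gym.make("CartPole-v0")  # substitute your custom env

for eps_id in range(100):
    obs = env.reset()
    prev_action = np.zeros_like(env.action_space.sample())
    prev_reward = 0.0
    done = False
    t = 0
    while not done:
        action = expert_action(obs)
        new_obs, rew, done, info = env.step(action)
        batch_builder.add_values(
            t=t,
            eps_id=eps_id,
            agent_index=0,
            obs=obs,
            actions=action,
            action_prob=1.0,  # deterministic expert
            action_logp=0.0,
            rewards=rew,
            prev_actions=prev_action,
            prev_rewards=prev_reward,
            dones=done,
            infos=info,
            new_obs=new_obs,
        )
        obs = new_obs
        prev_action = action
        prev_reward = rew
        t += 1
    # Flush one episode per JSON row.
    writer.write(batch_builder.build_and_reset())
```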
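Once the dataset exists, training BC or MARWIL from it should mostly be a matter of pointing the trainer's `input` at that directory. A minimal sketch (trainer/config names as of Ray ~1.x; env name and paths are again assumptions):

```python
from ray.rllib.agents.marwil import MARWILTrainer

config = {
    "env": "CartPole-v0",        # your custom env here
    "input": "/tmp/expert-out",  # directory written by the recorder above
    "input_evaluation": [],      # skip off-policy estimation for a quick start
    "beta": 1.0,                 # beta=0.0 reduces MARWIL to plain BC
    "framework": "torch",
}

trainer = MARWILTrainer(config=config)
for i in range(10):
    result = trainer.train()
    # episode_reward_mean may be nan for pure offline input
    # unless you also configure evaluation rollouts.
    print(i, result.get("episode_reward_mean"))

# The warm-started weights could then seed an online trainer, e.g.:
# weights = trainer.get_policy().get_weights()
# online_trainer.get_policy().set_weights(weights)
```

From there, a JSRL-style setup would roughly amount to letting the imitation-learned policy act as the guide for the first part of each episode and handing control to the fine-tuning agent afterwards, but that piece isn't covered by the links above.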
I hope to start working on a JSRL implementation myself in July 2022. It would be great to hear about your experiences and progress in the meantime.