I want to implement GAIL/AIRL in Ray.
If I use one trainer (e.g. PPOTrainer) to optimize the Generator and another to optimize the Discriminator, I need to define a Trainer subclass that performs standard SGD optimization for the Discriminator. Is there an existing trainer, class, or function for this purpose?
Also, to train the Discriminator, I need to sample expert data batches and generated data batches in parallel. Is there a good way to do this?
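To make the question concrete, here is a rough sketch of the kind of Discriminator update I have in mind, written in plain NumPy and deliberately outside of Ray. Everything in it is an assumption for illustration: the linear discriminator, the name `discriminator_sgd_step`, the learning rate, and the synthetic "expert" and "generated" batches are all hypothetical, and it does not address the parallel sampling part.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def discriminator_sgd_step(w, b, expert_batch, generated_batch, lr=0.1):
    """One SGD step on binary cross-entropy for a linear discriminator.

    Hypothetical helper: expert samples are labeled 1, generated samples 0.
    Returns the updated parameters and the loss measured before the update.
    """
    x = np.concatenate([expert_batch, generated_batch], axis=0)
    y = np.concatenate([np.ones(len(expert_batch)), np.zeros(len(generated_batch))])
    p = sigmoid(x @ w + b)

    # Gradient of the mean cross-entropy with respect to the logits.
    grad_logits = (p - y) / len(y)
    w_new = w - lr * (x.T @ grad_logits)
    b_new = b - lr * grad_logits.sum()

    eps = 1e-8  # avoids log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return w_new, b_new, loss


# Synthetic stand-ins for an expert batch and a generator rollout batch.
rng = np.random.default_rng(0)
dim = 4
expert = rng.normal(loc=1.0, size=(64, dim))
generated = rng.normal(loc=-1.0, size=(64, dim))

w, b = np.zeros(dim), 0.0
losses = []
for _ in range(50):
    w, b, loss = discriminator_sgd_step(w, b, expert, generated)
    losses.append(loss)
```

In the real setup, `expert` would come from the demonstration dataset and `generated` from the Generator's rollouts, and I would like the two batches to be collected in parallel before each such step.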