Example for the new RLModule API with wandb callbacks

Has anyone already run PPO with the new RLModule API, including checkpointing to Weights & Biases? It would be great if you could share your experience!