How could I train a model (let's say SlateQ) if I don't have a simulator environment for online training?
I'd plug the model into my website and log observations/actions to a replay buffer, but I can only calculate the reward in a daily batch. So I would effectively be training an online model offline. Is that possible?
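To make the setup concrete, here is a rough sketch of the logging flow I have in mind. All names (`Transition`, `DelayedRewardBuffer`, `event_id`) are made up for illustration, and it deliberately uses no RLlib API. It only shows how transitions could wait for their reward until the daily batch has run.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Transition:
    obs: Any
    action: Any
    next_obs: Any
    reward: Optional[float] = None  # unknown until the daily batch runs


class DelayedRewardBuffer:
    """Hypothetical buffer: holds website-logged transitions keyed by an event id."""

    def __init__(self) -> None:
        self._pending: Dict[str, Transition] = {}  # logged, reward not yet known
        self._ready: List[Transition] = []         # reward attached, usable for training

    def log(self, event_id: str, obs: Any, action: Any, next_obs: Any) -> None:
        # Called from the website at serving time; no reward is available yet.
        self._pending[event_id] = Transition(obs, action, next_obs)

    def apply_daily_rewards(self, rewards: Dict[str, float]) -> int:
        # Called once per day; returns how many transitions became trainable.
        moved = 0
        for event_id, reward in rewards.items():
            transition = self._pending.pop(event_id, None)
            if transition is None:
                continue  # reward for an event that was never logged; skip it
            transition.reward = reward
            self._ready.append(transition)
            moved += 1
        return moved

    def drain_ready(self) -> List[Transition]:
        # Hand over all completed transitions and clear the ready list.
        batch, self._ready = self._ready, []
        return batch


buf = DelayedRewardBuffer()
buf.log("e1", obs=[0.1], action=2, next_obs=[0.3])
buf.log("e2", obs=[0.3], action=0, next_obs=[0.5])
buf.apply_daily_rewards({"e1": 1.0})  # only e1 has a reward so far
train_batch = buf.drain_ready()       # completed transitions for the next training run
```

The part I'm unsure about is the last step: how to feed something like `train_batch` into SlateQ without an environment to step through.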
Thank you in advance!