Hello dear community,
I have a question regarding the ray.rllib.evaluation.sample_batch_builder module, and I am not yet fully familiar with RLlib.
I want to use expert data from another source to pretrain an algorithm such as the parameter-sharing version of DQN or PPO.
Therefore, I thought about using the sample batch builder to turn this offline data into batches for pretraining.
Unfortunately, I couldn’t find any example of how to use the MultiAgentSampleBatchBuilder. The problem is:
How do I define the policy map, which is a mandatory argument, in advance?
Do I have to instantiate a policy and hand it to the function?
How do I instantiate the policy?