This is the code snippet I am using for converting trajectory sequences to a SampleBatch.
from ray.rllib.policy.sample_batch import SampleBatch

# s, a, r, s_prime are arrays of observations, actions, rewards, next observations
rllib_batch_dict = {
    "obs": s, "actions": a, "rewards": r, "new_obs": s_prime
}
rllib_batch = SampleBatch(rllib_batch_dict)
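As far as I understand, RLlib's offline input expects the batches on disk, and RLlib ships a JsonWriter (ray.rllib.offline.json_writer.JsonWriter) that serializes SampleBatches into the JSON format JsonReader consumes. The idea can be sketched without RLlib using the stdlib json module; the field names and one-record-per-line layout below are simplified assumptions, not RLlib's exact on-disk schema:

```python
import json

# Hypothetical trajectory data standing in for s, a, r, s_prime.
s = [[0.0], [1.0], [2.0]]
a = [0, 1, 0]
r = [1.0, 0.5, -1.0]
s_prime = [[1.0], [2.0], [3.0]]

# One JSON line per batch -- a simplified stand-in for what
# RLlib's JsonWriter produces (the real schema has more fields).
record = {"obs": s, "actions": a, "rewards": r, "new_obs": s_prime}
line = json.dumps(record)

# Reading the line back recovers the same batch dict,
# which is what a JSON reader would do on the training side.
restored = json.loads(line)
```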
The following is the code snippet I use to train DQN offline:
import ray
from ray.rllib.agents import dqn

ray.init()
config = dqn.DEFAULT_CONFIG.copy()
config["num_gpus"] = 0
config["num_workers"] = 1
# config["input"] = "custom_input"
trainer = dqn.DQNTrainer(config=config)
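If the batches were first written out as JSON files (e.g. with RLlib's JsonWriter), my understanding is that the trainer would be pointed at that directory through the input key; the path below is a placeholder assumption, not something from my setup:

```python
# Point the trainer at a directory of offline JSON batches.
# "/tmp/offline-data" is a placeholder path.
config["input"] = "/tmp/offline-data"
# With no live environment, off-policy estimation can be disabled.
config["input_evaluation"] = []
```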
I am not sure how to feed the SampleBatch object rllib_batch into config.
I can see the following code snippet on the official Ray docs for configuring JSON batches as input, but I don't see how to adapt it to my case, where I have a SampleBatch object rather than JSON batches.
trainer = DQNTrainer(...)
... # train policy offline
from ray.rllib.offline.json_reader import JsonReader
from ray.rllib.offline.wis_estimator import WeightedImportanceSamplingEstimator
estimator = WeightedImportanceSamplingEstimator(trainer.get_policy(), gamma=0.99)
reader = JsonReader("/path/to/data")
for _ in range(1000):
    batch = reader.next()
    for episode in batch.split_by_episode():
        print(estimator.estimate(episode))
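From what I can tell, split_by_episode relies on episode boundaries being recorded in the batch (done flags / episode ids), which is one more field my hand-built batch dict would need. The boundary logic can be sketched in plain Python; the helper name here is mine, not RLlib's API:

```python
def split_by_dones(batch, dones):
    """Split a batch dict into per-episode dicts at done=True boundaries.

    Illustrative helper, not RLlib's API; assumes every value in
    `batch` is a list aligned index-for-index with `dones`.
    """
    episodes, start = [], 0
    for i, done in enumerate(dones):
        if done:
            # Slice every field of the batch over this episode's steps.
            episodes.append({k: v[start:i + 1] for k, v in batch.items()})
            start = i + 1
    return episodes

batch = {"rewards": [1.0, 0.5, -1.0, 2.0], "actions": [0, 1, 0, 1]}
episodes = split_by_dones(batch, dones=[False, True, False, True])
# Two episodes of two steps each.
```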