@Lars_Simon_Zehnder Yes, thank you. I found a solution in another thread. I wish the docs on converting external experiences to batch format were a bit more explicit that an env is not strictly necessary, and maybe provided an example like the one earlier in this thread.
My only remaining question is: how do I specify the offline data for both the training input and the evaluation input in the config?
import numpy as np
from gymnasium.spaces import Box, Dict, Discrete
from ray.rllib.algorithms.dqn import DQNConfig
from ray.rllib.offline.estimators import ImportanceSampling, WeightedImportanceSampling

config = (
    DQNConfig()
    .framework("tf2")
    .offline_data(
        input_config={
            "paths": ["/root/DRL/reward1/0/train/output-2023-09-10_19-16-56_worker-0_0"],
            "format": "json",
            "input": "dataset",
            "explore": False,
        },
    )
    .environment(
        observation_space=Dict({
            "obs": Box(low=-10000, high=100000, shape=(32,), dtype=np.float32)
        }),
        action_space=Discrete(2),
    )
    .debugging(log_level="INFO")
    .evaluation(
        evaluation_interval=1,
        evaluation_duration=10,
        evaluation_num_workers=1,
        evaluation_duration_unit="episodes",
        evaluation_config={
            "paths": ["/root/DRL/reward1/0/test/output-2023-09-10_19-16-56_worker-0_0"],
            "format": "json",
            "explore": False,
            "input": "dataset",
        },
        off_policy_estimation_methods={
            "is": {"type": ImportanceSampling},
            "wis": {"type": WeightedImportanceSampling},
        },
    )
)
I ask because the off-policy estimation methods need the evaluation data to come in as a dataset, but with this setup it seems to be provided as sampler input.
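To make the question concrete, here is the shape I would guess the evaluation side needs, with the dataset reader settings nested under "input_config" inside evaluation_config. This is only a guess based on how the training side is configured; the "input" and "input_config" override keys inside evaluation_config are my assumption:

eval_overrides = {
    # Guess: point the evaluation workers at a dataset reader instead of an env sampler.
    "input": "dataset",
    "input_config": {
        "format": "json",
        "paths": ["/root/DRL/reward1/0/test/output-2023-09-10_19-16-56_worker-0_0"],
    },
    "explore": False,
}

config = config.evaluation(evaluation_config=eval_overrides)

Is nesting it like this the intended way to get the off-policy estimators a dataset?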