How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hello, I am using Tune to run trials over different hyperparameters with RLlib in a custom environment.
My problem is that I expected a final checkpoint to be saved per trial, but although each trial folder contains a checkpoint folder, it does not hold the correct files.
Specifically, I am getting the following folder structure:
Experiment/
    Trial/
        checkpoint_-00001/
            .is_checkpoint
            .null_marker
            .tune_metadata
        params
        progress
        results
Essentially, I was hoping to end up with a trained agent per trial, so I could select the best one and restore it to perform actions on my environment. From my understanding, checkpoint_at_end=True was supposed to save these checkpoints.
Is there another way to load a trained agent, apart from checkpoints?
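For context, this is roughly how I intended to use the checkpoints once I have valid ones (a sketch based on the Trainer restore()/compute_action() API; best_checkpoint_path and env are placeholders for my setup):

from ray.rllib.agents.dqn import DQNTrainer

# Placeholder: path to the best trial's checkpoint file, e.g.
# <local_dir>/Exp1/<trial_name>/checkpoint_000010/checkpoint-10
best_checkpoint_path = "/path/to/best/checkpoint"

agent = DQNTrainer(config=config)  # same config as used for training
agent.restore(best_checkpoint_path)

# env is my custom Gym environment
obs = env.reset()
done = False
while not done:
    action = agent.compute_action(obs)
    obs, reward, done, info = env.step(action)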
Here is my snippet:
import ray
from ray import tune
from ray.rllib.agents.dqn import DQNTrainer

def experiment(config):
    iterations = config.pop("train-iterations")
    train_agent = DQNTrainer(config=config)
    checkpoint = None  # never assigned again; nothing here calls train_agent.save()
    train_results = {}
    for i in range(iterations):
        train_results = train_agent.train()
        tune.report(**train_results)
    train_agent.stop()

# config and raylog are defined earlier in my script
config["lr"] = tune.grid_search([1e-5, 1e-4])
tuneobject = tune.run(
    experiment,
    config=config,
    local_dir=raylog,
    checkpoint_at_end=True,
    checkpoint_freq=10,
    name='Exp1',
    checkpoint_score_attr="episode_reward_mean")
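For completeness, the only workaround I can think of is to save explicitly from inside the function trainable, along the lines of the custom-experiment example in the RLlib repo (a sketch; I have not verified that this replaces the empty checkpoint_-00001 folders):

from ray import tune
from ray.rllib.agents.dqn import DQNTrainer

def experiment_with_manual_save(config):
    iterations = config.pop("train-iterations")
    train_agent = DQNTrainer(config=config)
    checkpoint = None
    train_results = {}
    for i in range(iterations):
        train_results = train_agent.train()
        # Write a real checkpoint into this trial's directory every 10
        # iterations and on the last one (assumes tune.get_trial_dir()
        # works as documented inside a function trainable)
        if i % 10 == 0 or i == iterations - 1:
            checkpoint = train_agent.save(tune.get_trial_dir())
        tune.report(**train_results)
    train_agent.stop()

But I would prefer to rely on checkpoint_at_end / checkpoint_freq, if those are supposed to work with function trainables.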
Thank you