Hi! I want to check the corresponding metrics for each of my checkpoints saved when training with RLlib and Ray Tune. I’ve referred to the ray.train.Result
API and tried to restore a result from its path. However, there is no “checkpoint_dir_name” column in the restored metrics_df, so loading the result fails.
This can be easily reproduced with the following code:
from ray.rllib.algorithms.ppo import PPOConfig
from ray import tune, train
from ray.train import RunConfig, CheckpointConfig
tuner = tune.Tuner(
    "PPO",
    param_space=PPOConfig().environment("CartPole-v1").to_dict(),
    run_config=RunConfig(
        checkpoint_config=CheckpointConfig(checkpoint_at_end=True, checkpoint_frequency=1),
        stop={"training_iteration": 3},
    ),
)
best_result = tuner.fit().get_best_result()
# Loading
from ray.train import Result
restored_result = Result.from_path(best_result.path)
The script above raises a KeyError saying that there is no “checkpoint_dir_name” column in metrics_df.
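To narrow this down, it may help to look at what was actually written to the trial directory. The helper below is only a diagnostic sketch using the standard library, not part of the Ray API. It assumes the trial directory contains a newline-delimited result.json file and checkpoint folders whose names start with "checkpoint_"; both are assumptions about the on-disk layout and may differ between Ray versions. The name inspect_trial_dir is hypothetical.

```python
import json
from pathlib import Path


def inspect_trial_dir(trial_dir):
    """Summarize checkpoint folders and logged result rows in a trial directory.

    Assumptions (not confirmed for every Ray version):
    - checkpoint folders are named "checkpoint_<something>"
    - per-iteration results are stored as JSON lines in "result.json"
    """
    trial_path = Path(trial_dir)

    # Collect the names of all checkpoint-like subdirectories.
    checkpoint_dirs = sorted(
        p.name
        for p in trial_path.iterdir()
        if p.is_dir() and p.name.startswith("checkpoint_")
    )

    # Parse each non-empty line of result.json as one result row.
    rows = []
    result_file = trial_path / "result.json"
    if result_file.exists():
        with result_file.open() as f:
            rows = [json.loads(line) for line in f if line.strip()]

    # Count how many rows record which checkpoint they belong to.
    rows_with_key = sum(1 for row in rows if "checkpoint_dir_name" in row)

    return {
        "checkpoint_dirs": checkpoint_dirs,
        "num_result_rows": len(rows),
        "rows_with_checkpoint_dir_name": rows_with_key,
    }


# Example usage with the reproduction script above:
# print(inspect_trial_dir(best_result.path))
```

If checkpoint folders exist but no row contains "checkpoint_dir_name", that would suggest the checkpoints are saved while the logged results never reference them, which would be consistent with the KeyError described above.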