I’m trying to restore the trained policy of a Tune experiment to use it in a real scenario, where the observation space comes from instrumentation rather than from the simulated scenario I used for training the policy.
During training I saved the algorithm, with the policies inside the directory, as the documentation suggests. I found the path to the policy with its two files (.pkl and .json) without difficulty, but when I tried to restore the policy I got a ValueError:
```
Traceback (most recent call last):
  File "c:\Users\grhen\Documents\GitHub\EP_RLlib\GENERAL_QMIX_init_evaluation.py", line 6, in <module>
    policy = Policy.from_checkpoint(checkpoint_path)
  File "C:\Users\grhen\anaconda3\envs\dwelling_DRL\lib\site-packages\ray\rllib\policy\policy.py", line 345, in from_checkpoint
    return Policy.from_state(state)
  File "C:\Users\grhen\anaconda3\envs\dwelling_DRL\lib\site-packages\ray\rllib\policy\policy.py", line 365, in from_state
    raise ValueError(
ValueError: No `policy_spec` key was found in given `state`! Cannot create new Policy.
```
My code is very simple:
```python
from ray.rllib.policy.policy import Policy

checkpoint_path = "C:/Users/grhen/ray_results/ajuste_modelo_general_QMIX_5/QMIX_EPEnv_24cb3_00005_5_gamma=0.9900,lr=0.0100,mixing_embed_dim=32_2023-09-17_09-38-46/checkpoint_001400/policies/default_policy"
policy = Policy.from_checkpoint(checkpoint_path)

print(policy)
print(policy.get_weights())
```
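In case it matters, the only workaround I can think of is restoring the whole `Algorithm` and pulling the policy out of it, rather than restoring the policy directly. A sketch of that idea, assuming `Algorithm.from_checkpoint` accepts the top-level `checkpoint_001400` directory (the `isdir` guard just keeps the snippet from failing on machines where that path does not exist):

```python
import os

# Top-level checkpoint directory (the parent of /policies/default_policy).
# Algorithm.from_checkpoint expects this directory, not the per-policy subdir.
algo_checkpoint_dir = (
    "C:/Users/grhen/ray_results/ajuste_modelo_general_QMIX_5/"
    "QMIX_EPEnv_24cb3_00005_5_gamma=0.9900,lr=0.0100,mixing_embed_dim=32"
    "_2023-09-17_09-38-46/checkpoint_001400"
)

if os.path.isdir(algo_checkpoint_dir):
    # Import locally so the sketch can be read/run without Ray installed.
    from ray.rllib.algorithms.algorithm import Algorithm

    # Rebuild the full algorithm from the checkpoint, then grab its policy.
    algo = Algorithm.from_checkpoint(algo_checkpoint_dir)
    policy = algo.get_policy("default_policy")
    print(policy.get_weights())
```

But this seems heavyweight for inference only, so I’d prefer to restore just the policy if possible.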
How can I fix this error? How do I specify a `policy_spec` in the algorithm config so that it gets saved during the checkpointing process? Is the problem in the code I used for training and checkpointing, or is it a bug?