As per the docs, I implemented the Weights & Biases integration with Ray Tune as follows:
from ray.air.integrations.wandb import WandbLoggerCallback
wandb_callback = WandbLoggerCallback(project="Ray Tune Trial Run", log_config=True, save_checkpoints=True)
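For context, this is roughly how I wire the callback into the Tuner (a minimal sketch; the "PPO" trainable, environment, and search space are placeholders for my actual setup):

from ray import air, tune
from ray.air.integrations.wandb import WandbLoggerCallback

wandb_callback = WandbLoggerCallback(
    project="Ray Tune Trial Run",
    log_config=True,
    save_checkpoints=True,  # upload trial checkpoints to W&B as artifacts
)

tuner = tune.Tuner(
    "PPO",  # RLlib algorithm registered with Tune; placeholder for my trainable
    param_space={"env": "CartPole-v1", "lr": tune.grid_search([1e-3, 1e-4])},
    run_config=air.RunConfig(
        callbacks=[wandb_callback],
        checkpoint_config=air.CheckpointConfig(checkpoint_frequency=1),
    ),
)
results = tuner.fit()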
But when I check the artifacts that are saved because of save_checkpoints=True and download them, I am unable to load the RL agent. The checkpoints are not being stored. Am I missing something, or do I have to save the checkpoints manually?
Hi @Athe-kunal,
Are you using an S3 upload directory? If so, this is because trial artifacts (these wandb checkpoints) are not being uploaded to the cloud. We have recently added artifact syncing here: https://github.com/ray-project/ray/pull/32334.
You can try it out on the latest Ray nightly. Note that there are still a few limitations (see the PR description for more details). Let me know if any of these limitations block your usage.
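For reference, by "S3 upload directory" I mean a setup along these lines (a hypothetical sketch on the Ray ~2.3 API; the bucket name is a placeholder):

from ray import air, tune

run_config = air.RunConfig(
    callbacks=[wandb_callback],
    # sync trial results/checkpoints to cloud storage instead of keeping them local
    sync_config=tune.SyncConfig(upload_dir="s3://my-bucket/ray-results"),
)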
Hi @justinvyu
No, I am not using an S3 upload directory. I am training the model in a local directory and uploading to Weights & Biases. But I am only getting the policy checkpoints in Weights & Biases, and I cannot load the model from them.
Are you trying to load an RLlib policy? Does Policy.from_checkpoint work for you? See Saving and Loading your RL Algorithms and Policies — Ray 2.3.0.
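Something along these lines should work for loading (a minimal sketch; the checkpoint path and the observation are placeholders for whatever you downloaded from W&B):

from ray.rllib.policy.policy import Policy

# For an algorithm checkpoint this returns a dict mapping policy IDs
# (e.g. "default_policy") to Policy objects; for a single policy
# checkpoint it returns that Policy directly.
restored = Policy.from_checkpoint("/path/to/downloaded/checkpoint")
policy = restored["default_policy"] if isinstance(restored, dict) else restored

# Policy.compute_single_action returns (action, rnn_state_out, extra_info).
# my_observation is a placeholder for an observation from your environment.
action, _, _ = policy.compute_single_action(obs=my_observation)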
Thank you @justinvyu for clarifying. I will follow that GitHub PR closely to resolve this.