How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
I ran ray tune to identify the model with optimal hyperparams, and I have that saved to a file.
I want to read in that checkpointed model, and draw the accuracy and loss plots for training and validation over epochs, for that best model.
I can work out how to use ray tune for HPO and save the best model, and how to read the best model back in, but I’m stuck on the last part.
Is there a simple way (it’s my first time using both ray tune and pytorch) for me to add in ‘make accuracy and loss plots of training’ to the checkpointed model at some point? If it matters, I used pytorch lightning to train a graph-based model with pytorch geometric.
The code:
def train_fn(config):
train_graph_classifier(
model_name="GraphConv",
layer_name="GraphConv",
**config)
analysis = tune.run(
train_fn,
config={
"c_hidden": tune.choice([64, 128,256]),
"dp_rate_linear": tune.choice([0.1,0.3,0.5]), #could change to quniform and give a 3-point tuple range
"num_layers":tune.choice([3,4,5,6]),
"dp_rate":tune.choice([0.1,0.3,0.5,0.7])
},
local_dir='/home/test_predictor/ray_ckpt2', # path for saving checkpoints
metric="val_loss",
mode="min",
num_samples=16,
scheduler=scheduler_asha,
progress_reporter=reporter,
name="test")
#find best model
path = analysis.best_checkpoint + '/' + "ray_ckpt"
model = GraphLevelGNN.load_from_checkpoint(path)
....how do I say 'plot model training and val accuracy and loss per epoch here'
Thanks.