Hi Everyone,
I hope this isn’t too much of a newbie question. I am new to AI and PyTorch.
I have followed the tutorial here:
How to use Tune with PyTorch — Ray 2.0.1
I am a little confused about one part: the maximum number of epochs.
In the train_cifar function there is this loop:
for epoch in range(10):
and in the main function it says
def main(num_samples=10, max_num_epochs=10, gpus_per_trial=2):
with max_num_epochs passed to the ASHAScheduler as max_t=max_num_epochs.
My question is, what is the difference between these two epoch counters?
I get that the one in train_cifar is how many epochs we use to train the model, but then what does max_num_epochs in the main function do? Should the two just be the same?
I thought that maybe the one in main might rerun the train_cifar function up to max_num_epochs times until train_cifar converged.
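To make the question concrete, here is a small plain-Python toy with no Ray in it. It is only a sketch of my other guess, that max_t is simply a cap on how many reported epochs a single trial may run. The names train_stub and run_trial are made up for illustration and are not from the tutorial, and the toy leaves out ASHA's early stopping of badly performing trials. I do not know whether this matches what Ray Tune really does.

```python
# Toy stand-in, no Ray required. All names here are hypothetical.

def train_stub(report, epochs_in_loop=10):
    """Mimics the shape of train_cifar: one report per epoch."""
    for epoch in range(epochs_in_loop):
        # In the tutorial, this is where the metrics are reported to Tune.
        keep_going = report(epoch + 1)
        if not keep_going:
            break


def run_trial(max_t, epochs_in_loop=10):
    """Toy 'scheduler' that only enforces a cap of max_t reported epochs."""
    seen = []

    def report(iteration):
        seen.append(iteration)
        # Assumption: the trial is stopped once max_t epochs have been reported.
        return iteration < max_t

    train_stub(report, epochs_in_loop)
    return len(seen)


print(run_trial(max_t=10))  # loop length and cap are equal
print(run_trial(max_t=5))   # cap is smaller than the loop length
print(run_trial(max_t=20))  # cap is larger than the loop length
```

If this guess is right, the number of epochs a trial actually runs would be the smaller of the two values, and train_cifar would never be rerun from scratch.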
Thank you for your help! I am sure this is quite trivial for most of you; maybe one day it will be for me too!