Pytorch Tutorial understanding

Chumpington · November 8, 2022, 1:32am

Hi Everyone,

I hope this isn’t too much of a newbie question. I am new to AI and PyTorch.

I have followed the tutorial here:
How to use Tune with PyTorch — Ray 2.0.1

I am a little confused about one part: the maximum number of epochs.

In the code “train_cifar” we see it has:

for epoch in range(10):

and in the main function it says

def main(num_samples=10, max_num_epochs=10, gpus_per_trial=2):

with the ASHAScheduler using this for max_t=max_num_epochs

My question is, what is the difference between these two epoch counters?
I get the one in train_cifar would be how many epochs we use to train the model, but then what does this max_num_epochs in the main function do? Should they just be the same?

I thought that maybe the one in main might rerun the train_cifar function up to max_num_epochs times until train_cifar converged.

Thank you for your help! I am sure this is quite trivial for most of you, maybe one day it will be for me too!

xwjiang2010 · November 8, 2022, 5:00pm

Hi!
max_num_epochs is passed to ASHAScheduler’s initializer as max_t: Trial Schedulers (tune.schedulers) — Ray 2.1.0

It means: max time units per trial. Trials will be stopped after max_t time units (determined by time_attr) have passed.

So if you set this higher than 10, it probably doesn’t make any difference, since the training function only iterates 10 times (for epoch in range(10)). If you set it lower than 10, the trial will be stopped before the loop finishes all 10 epochs, which is probably not what you want. By default, max_t is 100, so feel free to leave it as is if your epoch count is only 10.
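To make the interaction concrete, here is a small sketch in plain Python. It does not use Ray and is not how ASHAScheduler is implemented; train_loop and run_trial are hypothetical stand-ins, and it assumes the training function reports once per epoch, as the tutorial's train_cifar does. It only models the max_t cap. The real scheduler may also stop poorly performing trials earlier than this.

```python
def train_loop(num_epochs):
    """Hypothetical stand-in for train_cifar: yields one report per epoch."""
    for epoch in range(num_epochs):
        # After the first epoch the iteration count is 1, and so on.
        yield epoch + 1


def run_trial(loop_epochs, max_t):
    """Return how many epochs complete when the trial is capped at max_t reports."""
    completed = 0
    for iteration in train_loop(loop_epochs):
        completed = iteration
        if iteration >= max_t:
            # Models the scheduler stopping the trial once max_t is reached.
            break
    return completed


if __name__ == "__main__":
    for max_t in (5, 10, 20):
        print(f"loop=10, max_t={max_t}: {run_trial(10, max_t)} epochs run")
```

Under these assumptions, the number of epochs a trial can run is at most the smaller of the two values, which is why setting them equal is a reasonable choice.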

Chumpington · November 8, 2022, 7:33pm

Ah thank you for this explanation!

It sounds like they should probably just be equal.
