[Out-of-Disk/ASHA Scheduler] Delete the stopped checkpoints in the ASHA scheduler

Hi,

I’m using the ASHA scheduler for parameter selection, but storing the checkpoints takes about 250 GB. Is it possible to delete the checkpoints of stopped trials (i.e., trials not promoted to the next rung)?

The code is like this:

from ray.tune.schedulers import ASHAScheduler

scheduler = ASHAScheduler(metric=f'val_{config.val_metric}',
                          mode=args.mode,
                          max_t=100,
                          grace_period=10,
                          reduction_factor=3,
                          brackets=1)

Are there any suggestions for freeing up disk space while the run is still in progress?
Thanks!

Hi, generally you can use the keep_checkpoints_num argument of tune.run to limit how many checkpoints are kept for each trial. There is currently no way to automatically delete all checkpoints of early-stopped trials, but that sounds like a good feature to add.
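
Until something like that exists, a manual cleanup may be enough as a stopgap. The snippet below is only a rough sketch and not part of Ray Tune: delete_trial_checkpoints is a hypothetical helper, and it assumes that each trial has its own directory with checkpoints stored in sub-directories whose names start with "checkpoint_". Please verify that layout on your own results folder first. Deciding which trials were actually stopped is left to you.

```python
import shutil
from pathlib import Path


def delete_trial_checkpoints(trial_dir, prefix="checkpoint_"):
    """Delete checkpoint sub-directories of one trial and return the bytes freed.

    Hypothetical helper, not a Ray Tune API. Assumes checkpoints live in
    sub-directories of trial_dir whose names start with `prefix`.
    """
    freed = 0
    # Materialize the matches first so we do not delete while iterating.
    for ckpt in sorted(Path(trial_dir).glob(prefix + "*")):
        if not ckpt.is_dir():
            continue  # leave plain files such as logs or result files alone
        # Sum the file sizes before removing the directory tree.
        freed += sum(f.stat().st_size for f in ckpt.rglob("*") if f.is_file())
        shutil.rmtree(ckpt)
    return freed
```

You would call it once per stopped trial, e.g. delete_trial_checkpoints("/path/to/experiment/trial_dir"), where the path is a placeholder for the directory of a trial you know has been terminated. Only run it on trials that will not be resumed, since their checkpoints cannot be recovered afterwards.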

cc @Yard1 and @xwjiang2010 maybe add this onto the backlog?

@kai I see. Thanks for the help. :grin: