Hello everyone, I’m trying to do incremental training (more iterations) starting from a previously trained model’s checkpoint. The checkpoint is stored in an S3 bucket, and I would like to restore it without using resume,
to avoid instant termination as suggested here. When I try to do that, I get the following error: FileNotFoundError: [Errno Path does not exist]
Is this a bug? Is there a way to do that?