@saivivek15 how are you running and submitting your job?
Just for clarification:
sync_config=SyncConfig(syncer=None)
disables all syncing between nodes (and to cloud storage), which leads to the error you are seeing.
sync_config = tune.SyncConfig(upload_dir="hdfs://...")
will upload checkpoints to HDFS. In this case you shouldn’t see the error.
sync_config = tune.SyncConfig(upload_dir="hdfs://...")
checkpoint_config = air.CheckpointConfig(num_to_keep=1)
should not throw an error either. It could be a bit slow, though, because syncing will be triggered often. If num_to_keep=2 is an option for you, it may perform better.
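
For reference, here is a minimal sketch of how these two configs might be wired together. It assumes a Ray 2.x release in which `tune.SyncConfig` still accepts `upload_dir` (later releases may configure storage differently) and passes both configs through `air.RunConfig`. `my_trainable` is a hypothetical placeholder, and the `hdfs://...` path is left elided as above.

```python
from ray import air, tune

# Assumption: a Ray 2.x release where SyncConfig accepts `upload_dir`.
run_config = air.RunConfig(
    # Upload checkpoints to HDFS (path elided, fill in your own).
    sync_config=tune.SyncConfig(upload_dir="hdfs://..."),
    # Keep the two most recent checkpoints to reduce how often syncing is triggered.
    checkpoint_config=air.CheckpointConfig(num_to_keep=2),
)

# `my_trainable` is a hypothetical placeholder for your own trainable.
# tuner = tune.Tuner(my_trainable, run_config=run_config)
# results = tuner.fit()
```

If you are setting these somewhere else (for example directly on a trainer), it would help to see that part of your script.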