WARNING syncer.py:585 -- Last sync command failed: Sync process failed

Hi All,

I am fairly new to Ray AIR. I am training a pre-trained model on my local machine using Ray Train and Ray Tune.

Trainer

trainer = TorchTrainer(
    train_loop_per_worker=train_loop_per_worker,
    train_loop_config=train_loop_config,
    scaling_config=scaling_config,
    #run_config=run_config,
    datasets={"train": train_ds_ray},
    #dataset_config=dataset_config,
    #preprocessor=preprocessor,
)

tuner = tune.Tuner(
    trainable=trainer,
    run_config=run_config,
)
results = tuner.fit()

The run_config is as follows:

run_config = RunConfig(
    storage_path=results_fp,
    name=experiment_name,
    checkpoint_config=checkpoint_config,
)

I am getting the warning shown in the topic title. On every iteration, a separate directory is created within the ray_results base directory and the following error is logged:
2023-08-07 14:45:39,686 WARNING syncer.py:585 -- Last sync command failed: Sync process failed: GetFileInfo() yielded path 'C:// /ray_results/llm-1691390435/TorchTrainer_52116_00000_0_2023-08-07_14-40-38', which is outside base dir 'C://\ray_results\llm-1691390435'.

Am I missing some additional config for syncing on a local machine?

@kai could you take a look here? Is there some parsing issue for Windows paths?

Yes, this looks like a Windows-specific issue. Can you run in WSL as a workaround? I'll take a stab at fixing this this week.
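
For illustration only, here is a minimal sketch of how mixed separators can make a path look like it is "outside" its own base directory. This is not Ray's or the underlying filesystem library's actual check. The helper names and the paths below are hypothetical, and the sketch uses only the standard library's `pathlib.PureWindowsPath`, so it runs on any OS.

```python
from pathlib import PureWindowsPath


def is_inside_naive(path: str, base: str) -> bool:
    # Plain string prefix check: "/" and "\" are treated as different characters.
    return path.startswith(base)


def is_inside_normalized(path: str, base: str) -> bool:
    # Parse both strings as Windows paths, so "/" and "\" are equivalent separators.
    p, b = PureWindowsPath(path), PureWindowsPath(base)
    return p == b or b in p.parents


# Hypothetical paths that mimic the mix of separators in the warning above.
base = r"C:\Users\me\ray_results\llm-1691390435"
child = "C:/Users/me/ray_results/llm-1691390435/TorchTrainer_52116_00000_0"

print(is_inside_naive(child, base))       # False: separators differ
print(is_inside_normalized(child, base))  # True: same location once parsed
```

If the real check behaves like the naive version, the trial directory would be reported as outside the base directory even though both strings point to the same place on disk.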

Hi Team,

Thank you for looking into this. I would like to mention that a similar issue has been raised in the past: https://discuss.ray.io/t/ray-tune-copies-checkpoint-to-the-same-location-when-running-locally/11151 . That thread contains some discussion and a few debugging attempts. I am pointing to it in case it helps in resolving this Windows error.