Autoscaler file share question

Hi there,

I tried using the Autoscaler (v1.2) to launch a cluster on GCP (without containers).

If I want to mount and share my local code directory to the head node and all worker nodes during scale-up, is there an appropriate way to do it without transferring a large amount of data across the network? (My understanding is that with the `file_mounts` setting in the cluster .yaml, the directory/files are first rsynced to the head node and then rsynced to all the other worker nodes.) I tested with 100 small 1 MB files in the shared folder: during cluster creation and scale-up to a target of 50 worker nodes, it took almost 30 minutes. Launching the nodes is fast, but getting a node to ready status takes a long time, seemingly because all the files need to be rsynced to the worker before its services restart. This would not be workable in a large-cluster scenario (hundreds to 1,000 nodes).
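For reference, the `file_mounts` setup I'm describing looks roughly like this (paths are illustrative):

```yaml
# Illustrative excerpt from the cluster .yaml.
# file_mounts rsyncs the local directory to the head node first,
# then from the head node to every worker as it comes up.
file_mounts:
    "/home/ray/code": "./code"
```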

We could possibly consider an NFS share (GCP Filestore) for the code directory, but that raises cost concerns as well as performance concerns, since hundreds of worker nodes would need to access the NFS share at once.

Any thought?

cc @Ameer_Haj_Ali, can you address this question?

Hi @sangcho, thanks for keeping me in the loop so quickly!
@kurtT, thanks for asking.
Would it be possible for you to use S3/Google Cloud Storage and pull the files in the `setup_commands`, so they can be pulled in parallel by all workers (bottlenecked mainly by network speed)?
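Something like the following in the cluster .yaml, for example (the bucket name and paths are hypothetical):

```yaml
# Hypothetical excerpt: each node pulls the code directly from GCS
# during its own setup, so downloads happen in parallel across workers
# instead of fanning out from the head node via rsync.
setup_commands:
    - mkdir -p /home/ray/code
    - gsutil -m rsync -r gs://my-code-bucket/code /home/ray/code
```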

CC @eoakes @ericl, this looks like a good use case for testing whether we should be doing file mounts in Ray core?

Hi @Ameer_Haj_Ali, thanks for the suggestion. Yes, that could be a possible solution. But our end users usually want a file system interface for accessing their code. That means if we go with Google Cloud Storage, we would need to see whether we can plug in FUSE so our users can mount the GCS object storage as a file system, and evaluate how well that solves the problem.
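For the record, a FUSE mount of a GCS bucket could be sketched in `setup_commands` roughly like this. The bucket name and mount point are hypothetical, and the install step assumes the gcsfuse apt repository is already configured on the node image; this is a sketch, not something I've tested:

```yaml
# Hypothetical excerpt: install gcsfuse and mount the code bucket on
# every node, so user code sees an ordinary directory backed by GCS.
setup_commands:
    - sudo apt-get update && sudo apt-get install -y gcsfuse
    - mkdir -p /home/ray/code
    - gcsfuse my-code-bucket /home/ray/code
```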