After the AutoTune run is done, the result log files are kept on the head node. We plan to copy all the files to where the Ray client is, then launch TensorBoard to inspect the results.
Is there such a utility function already? Or would we need to implement an actor to do the copying?
Hey @HuangLED, there is an undocumented `ray rsync-down` CLI command you can use (just type `ray rsync-down --help` for more info on how to use it). You can also achieve the same thing via manual `scp`. I don't think using an actor would work, since the actor would be scheduled on the head node and not on the client side.
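For reference, an invocation might look like the sketch below. The config filename and result paths are illustrative placeholders, not values from this thread; substitute your own cluster YAML and Tune results directory.

```shell
# Sketch: pull Tune result logs from the head node to the local machine.
# All paths below are illustrative placeholders.
CONFIG=cluster.yaml            # the YAML used with `ray up`
SRC=/home/ray/ray_results/     # results directory on the head node
DST=./ray_results/             # local destination
echo "ray rsync-down $CONFIG $SRC $DST"
```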
Is there a recommended API that achieves what `ray rsync-down` does? The reason I ask is that I am integrating this step into a piece of code.
If not, I guess one can always use `os.system()`.
You can do this programmatically using the Ray autoscaler SDK on the client side.
Have a follow-up question.
This `sdk.rsync()` API requires a `cluster_config`; do I need to construct it myself? I searched the discussion group a bit but couldn't find an answer. Or is there a handy mechanism to retrieve such a config (since at this point we are already connected to the cluster)?
This method should be called on the laptop, not on the cluster itself. The `cluster_config` should just be the path to the YAML file that you used for `ray up`.
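A minimal sketch of that call, assuming the cluster was started with `ray up cluster.yaml`. The config filename and both paths are placeholders I've chosen for illustration; `down=True` is what makes `sdk.rsync` copy from the head node to the local machine rather than the other way around.

```python
def pull_results(cluster_config: str = "cluster.yaml",
                 source: str = "/home/ray/ray_results/",
                 target: str = "./ray_results/") -> None:
    """Copy Tune result logs from the head node to the local machine.

    All default paths are illustrative placeholders.
    """
    # Import lazily so this sketch only requires Ray when actually run.
    from ray.autoscaler import sdk

    # down=True pulls head-node files down to the machine running this script.
    sdk.rsync(cluster_config, source=source, target=target, down=True)
```

Run this on the laptop (the machine that holds the cluster YAML), not on the cluster itself.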
ok. Thanks amog.
What if I used the command line to start the cluster? Is the config still a YAML file in that case?
By command line do you mean `ray up`? In that case, yes, it's just the path to the YAML file you used for `ray up`.
By command line, I mean I ran `ray start --head` on the head node, then the corresponding command on the non-head machines.
During the process, I never explicitly pointed to any YAML file (or at least the whole process is opaque to me; I am not sure which one is used, if any).
I’ve read the section about yaml file here: Config YAML and CLI Reference — Ray v1.8.0
Our particular use case is not using any cloud-native solution at all, though. We just manually started a cluster on raw machines and keep using it. Does the YAML file still apply in this case?
I dug into the docs a bit more; is this template for local mode, and does it fit my use case? ray/example-full.yaml at master · ray-project/ray · GitHub
@HuangLED ohhh, got it. Ok, if you are manually starting the Ray cluster and not using the Ray cluster launcher, then the autoscaler SDK won't be useful here.
I would just use the `subprocess` module, for example, to execute an `rsync` command, and add that to your Python script. This is what `ray.autoscaler.sdk.rsync` does under the hood anyway.
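The `subprocess` approach might look like the sketch below. The head-node address and both paths are placeholders, not values from this thread; it assumes passwordless SSH access to the head node and `rsync` installed on both ends.

```python
import subprocess

def rsync_down_cmd(head_addr: str, remote_dir: str, local_dir: str) -> list:
    """Build the rsync command to pull a remote directory to a local one.

    -a preserves permissions and timestamps; -z compresses over the wire.
    """
    return ["rsync", "-az", f"{head_addr}:{remote_dir}", local_dir]

def pull_results(head_addr: str = "ray@head-node",
                 remote_dir: str = "/home/ray/ray_results/",
                 local_dir: str = "./ray_results/") -> None:
    """Copy Tune result logs from the head node (placeholder defaults)."""
    subprocess.run(rsync_down_cmd(head_addr, remote_dir, local_dir), check=True)
```

Separating the command builder from the call makes it easy to log or dry-run the exact `rsync` invocation before executing it.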