After the AutoTune run is done, the result log files are kept on the head node. We plan to copy all the files to where the Ray client is, then launch TensorBoard to inspect the results.
Is there such a utility function already? Or would we need to implement an actor to do the copying?
Hey @HuangLED, there is an undocumented `ray rsync-down` CLI command you can use (just type `ray rsync-down --help` for more info on how to use it). You can also achieve the same thing via manual `scp`. I don't think using an actor would work, since the actor would be scheduled on the head node and not on the client side.
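For reference, an invocation might look like the sketch below. The config filename and result paths are illustrative placeholders, not values from this thread; substitute your own cluster YAML and Tune results directory.

```shell
# Sketch: pull Tune result logs from the head node to the local machine.
# All paths below are illustrative placeholders.
CONFIG=cluster.yaml            # the YAML used with `ray up`
SRC=/home/ray/ray_results/     # results directory on the head node
DST=./ray_results/             # local destination
echo "ray rsync-down $CONFIG $SRC $DST"
```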
Is there a recommended API that achieves what `ray rsync-down` does? The reason I ask is that I am integrating this step into a piece of code.
If not, I guess one can always use `os.system()`.
You can do this programmatically using the Ray autoscaler SDK on the client side.
Have a follow-up question.
This `sdk.rsync()` API requires a `cluster_config`; do I need to construct it myself? I searched the discussion group a bit but couldn't find an answer. Or is there a handy mechanism to retrieve such a config (since at this point we are already connected to the cluster)?
This method should be called on the laptop, not on the cluster itself. The `cluster_config` should just be the path to the YAML file that you used for `ray up`.
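A minimal sketch of that call, assuming the cluster was started with `ray up cluster.yaml`. The config filename and both paths are placeholders I've chosen for illustration; `down=True` is what makes `sdk.rsync` copy from the head node to the local machine rather than the other way around.

```python
def pull_results(cluster_config: str = "cluster.yaml",
                 source: str = "/home/ray/ray_results/",
                 target: str = "./ray_results/") -> None:
    """Copy Tune result logs from the head node to the local machine.

    All default paths are illustrative placeholders.
    """
    # Import lazily so this sketch only requires Ray when actually run.
    from ray.autoscaler import sdk

    # down=True pulls head-node files down to the machine running this script.
    sdk.rsync(cluster_config, source=source, target=target, down=True)
```

Run this on the laptop (the machine that holds the cluster YAML), not on the cluster itself.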
ok. Thanks amog.
What if I used the command line to start the cluster? Is the config still a YAML file in that case?
By command line do you mean `ray up`? In that case, yes, it's just the path to the YAML file you used for `ray up`.
By command line, I mean I ran `ray start --head` on the head node, then the corresponding command on the non-head machines.
During the process, I never explicitly pointed to any YAML file (or at least the whole process is opaque to me; I am not sure which one is used, if any).
I’ve read the section about yaml file here: Config YAML and CLI Reference — Ray v1.8.0
Our particular use case is not using any cloud-native solution at all, though. We just manually started a cluster on raw machines and keep using it. Does the YAML file still apply in this case?
I dug into the docs a bit more; is this template for local mode, and does it fit my use case? ray/example-full.yaml at master · ray-project/ray · GitHub
@HuangLED ohhh, got it. Ok, if you are manually starting the Ray cluster and not using the Ray cluster launcher, then the autoscaler SDK won't be useful here.
I would just use the `subprocess` module, for example, to execute an `rsync` command, and add that to your Python script. This is what `ray.autoscaler.sdk.rsync` does under the hood anyway.
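The `subprocess` approach might look like the sketch below. The head-node address and both paths are placeholders, not values from this thread; it assumes passwordless SSH access to the head node and `rsync` installed on both ends.

```python
import subprocess

def rsync_down_cmd(head_addr: str, remote_dir: str, local_dir: str) -> list:
    """Build the rsync command to pull a remote directory to a local one.

    -a preserves permissions and timestamps; -z compresses over the wire.
    """
    return ["rsync", "-az", f"{head_addr}:{remote_dir}", local_dir]

def pull_results(head_addr: str = "ray@head-node",
                 remote_dir: str = "/home/ray/ray_results/",
                 local_dir: str = "./ray_results/") -> None:
    """Copy Tune result logs from the head node (placeholder defaults)."""
    subprocess.run(rsync_down_cmd(head_addr, remote_dir, local_dir), check=True)
```

Separating the command builder from the call makes it easy to log or dry-run the exact `rsync` invocation before executing it.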