Using ray cluster launcher on GCP with private IP addresses

Hi,

I would like to use the following command

`ray up -vvv ray/python/ray/autoscaler/gcp/example-full.yaml`

from here to set up Ray on a GCP Dataproc cluster.

When I run the command, the master node gets created, but it doesn’t get past `[1/7] Waiting for SSH to become available`. When I inspect the ssh uptime command, I see two issues:

  1. For security reasons, my employer doesn’t want me to create VMs with public IP addresses.

  2. To interact with an instance, I have to use IAP for TCP forwarding. For example, when I ssh into an instance, I have to use a command like

    gcloud compute ssh dutch@instance-name --tunnel-through-iap

I don’t know of any standard ssh command that replicates this functionality.

I can install ray on dataproc using the manual setup, but are there any workarounds that would enable me to use the cluster launcher script?

Thanks.

cc @thomasdesr do you have any ideas on how to make this work?

Hey @dutch!

What you’re looking for in SSH’s documentation is called `ProxyCommand`. You can combine it with `gcloud compute start-iap-tunnel` to access instances by adding something like this to your SSH config (`~/.ssh/config`):

Host instance-*
    ProxyCommand gcloud compute start-iap-tunnel %h %p --listen-on-stdin --project=<your-project-id> --zone=<your-zone-id>

I’m not sure what the instance name is going to end up being so getting that right might require some tweaking of that code!
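Since the host pattern will probably need a few iterations, here is a minimal sketch of a helper that renders the stanza above for a given pattern, project, and zone. The function name `iap_ssh_config`, the `ray-*` pattern, and the project and zone values are placeholders for illustration, not something the cluster launcher provides; the pattern has to match whatever host string ends up being passed to `ssh`.

```python
def iap_ssh_config(host_pattern: str, project: str, zone: str) -> str:
    """Render an ~/.ssh/config stanza that tunnels matching hosts through IAP.

    %h and %p are left literal: ssh expands them to the target host and port
    when it runs the ProxyCommand.
    """
    proxy = (
        "gcloud compute start-iap-tunnel %h %p --listen-on-stdin "
        f"--project={project} --zone={zone}"
    )
    return f"Host {host_pattern}\n    ProxyCommand {proxy}\n"


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your own pattern, project, and zone.
    print(iap_ssh_config("ray-*", "my-project", "us-central1-a"), end="")
```

The printed stanza can be pasted into `~/.ssh/config` and adjusted once the actual instance names are known.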

Here are some additional links that might be helpful while you’re iterating on that:

Let me know how this goes! It’d be great to document a working path so others can benefit :smiley:

Thanks @thomasdesr! I’ll take a look.