Multiple availability zones for GCP

With the AWS provider, it’s possible to configure a cluster to launch instances in multiple zones. With GCP, however, this seems impossible. Is this intended, or is it a bug?

Config:

provider:
  type: gcp
  region: us-east1
  availability_zone: us-east1-c,us-east1-b
  project_id: <redacted>

Error from GCP

"Invalid value for field 'zone': 'us-east1-c,us-east1-b'. Must be a match of regex '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?'". Details: "[{'message': "Invalid value for field 'zone': 'us-east1-c,us-east1-b'. Must be a match of regex '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?'", 'domain': 'global', 'reason': 'invalid'}]">

I think this is a bug. However, the validation seems to be happening on the GCP side, not in Ray. Is there a separate field we should be using?
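For reference, the rejection can be reproduced locally with the regex quoted in the error message. This is only a sketch: `is_valid_zone` is a hypothetical helper, and it assumes GCP requires the whole value to match the pattern.

```python
import re

# Pattern copied from the GCP error message above.
ZONE_PATTERN = re.compile(r"[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?")


def is_valid_zone(zone: str) -> bool:
    """Return True if the whole string matches the zone pattern.

    Assumption: GCP applies the regex as a full match, so a comma
    anywhere in the value makes it invalid.
    """
    return ZONE_PATTERN.fullmatch(zone) is not None


# The comma-separated value from the config fails, each single zone passes.
print(is_valid_zone("us-east1-c,us-east1-b"))
print(is_valid_zone("us-east1-c"))
print(is_valid_zone("us-east1-b"))
```

The comma is not in the allowed character set, so the combined string can never be accepted as a single zone name.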

Could you also post the full stack trace?

Here you go! No, the field is fine. I compared your Python provider code for AWS and GCP: for AWS, you loop through availability_zone (split on ",") to get the subnets. The GCP provider does not do the same.

Traceback (most recent call last):
  File "<redacted>/venv/bin/ray", line 8, in <module>
    sys.exit(main())
  File "<redacted>/venv/lib/python3.8/site-packages/ray/scripts/scripts.py", line 1706, in main
    return cli()
  File "<redacted>/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "<redacted>/venv/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "<redacted>/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "<redacted>/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "<redacted>/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "<redacted>/venv/lib/python3.8/site-packages/ray/scripts/scripts.py", line 868, in up
    create_or_update_cluster(
  File "<redacted>/venv/lib/python3.8/site-packages/ray/autoscaler/_private/commands.py", line 242, in create_or_update_cluster
    get_or_create_head_node(config, config_file, no_restart, restart_only, yes,
  File "<redacted>/venv/lib/python3.8/site-packages/ray/autoscaler/_private/commands.py", line 543, in get_or_create_head_node
    nodes = provider.non_terminated_nodes(head_node_tags)
  File "<redacted>/venv/lib/python3.8/site-packages/ray/autoscaler/_private/gcp/node_provider.py", line 83, in non_terminated_nodes
    response = self.compute.instances().list(
  File "<redacted>/venv/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "<redacted>/venv/lib/python3.8/site-packages/googleapiclient/http.py", line 935, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://compute.googleapis.com/compute/v1/projects/<redacted>/zones/us-east1-c%2Cus-east1-b/instances?filter=%28%28labels.ray-node-type+%3D+head%29%29+AND+%28%28status+%3D+RUNNING%29+OR+%28status+%3D+PROVISIONING%29+OR+%28status+%3D+STAGING%29%29+AND+%28labels.ray-cluster-name+%3D+default%29&alt=json returned "Invalid value for field 'zone': 'us-east1-c,us-east1-b'. Must be a match of regex '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?'". Details: "[{'message': "Invalid value for field 'zone': 'us-east1-c,us-east1-b'. Must be a match of regex '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?'", 'domain': 'global', 'reason': 'invalid'}]">

Hmm, does GCP actually support this out of the box? I know AWS does.

GCP doesn’t support it. It would be the autoscaler’s job to split the zones with .split(",") and create the VMs in the requested availability zones.
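A minimal sketch of what that split could look like. This is not Ray's actual code: `parse_zones` and `assign_zones` are hypothetical helpers, and the round-robin placement is just one assumed policy for spreading nodes across zones.

```python
from itertools import cycle, islice
from typing import List


def parse_zones(availability_zone: str) -> List[str]:
    """Split a comma-separated availability_zone value into zone names."""
    zones = [zone.strip() for zone in availability_zone.split(",")]
    # Drop empty entries caused by stray or trailing commas.
    return [zone for zone in zones if zone]


def assign_zones(availability_zone: str, node_count: int) -> List[str]:
    """Pick a zone for each node, cycling through the configured zones.

    Round-robin is an assumption made for illustration only.
    """
    zones = parse_zones(availability_zone)
    if not zones:
        raise ValueError("availability_zone must name at least one zone")
    return list(islice(cycle(zones), node_count))


# Each returned entry is a single zone name that could be passed to GCP
# on its own, instead of the combined comma-separated string.
for index, zone in enumerate(assign_zones("us-east1-c,us-east1-b", 3)):
    print(f"node {index} -> {zone}")
```

Any call that takes a zone, such as the instance listing that fails in the traceback, would then need to be issued once per zone and the results merged.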

Hmm ok. Could you file a feature request for this then? This seems like a bit more work than expected!

Can you please link the feature request issue here? I'm having trouble finding it on GitHub. Thanks.