RAY_ADDRESS is the same as the address argument in ray.init(), but behaves differently?

How severe does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

Hi there! I have a Kubernetes cluster where I deploy two pods:
notebook-pods → the one that deploys my notebook to the cluster
cluster-pods → created with the KubeRay operator, running the cluster example from the website

Some details

  • The only difference is that I’m not using the rayproject/ray image; instead I’m using my own image (let’s call it image A), which already has all the Python dependencies I need, including ray[default, data, tune].
  • In the notebook pod’s YAML file I already declared the env variable RAY_ADDRESS=ray://<ray-cluster-name>.<mynamespace>.svc.cluster.local:10001
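
In case it helps with debugging, here is a minimal sketch of how the value that the notebook kernel actually sees could be checked. It uses only the Python standard library; describe_ray_address is a hypothetical helper name (not part of Ray), and it only inspects the string, it does not contact the cluster.

```python
import os
from urllib.parse import urlparse


def describe_ray_address(env=os.environ):
    """Parse RAY_ADDRESS from `env`; return None if the variable is unset.

    Hypothetical helper for inspection only, not part of the Ray API.
    """
    raw = env.get("RAY_ADDRESS")
    if raw is None:
        return None
    parsed = urlparse(raw)
    try:
        # .port is an int, or raises ValueError if the port is not numeric
        # (for example when a stray quote ended up inside the value).
        port = parsed.port
    except ValueError:
        port = None
    return {"raw": raw, "scheme": parsed.scheme, "host": parsed.hostname, "port": port}


# Run inside the notebook kernel to see what it inherited from the pod spec.
print(describe_ray_address())
```

If this prints None, the kernel did not inherit the variable; if port is None, the value is probably malformed (stray quotes, missing port, and so on).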

Objective
I open JupyterLab from notebook-pods and want to connect to the Ray cluster, then run the first few lines of the data example.
With the following code in JupyterLab:

import ray
ray.init(address='ray://<ray-cluster-name>.<mynamespace>.svc.cluster.local:10001')
ds = ray.data.range(10000)
ds.take(5)

It works just fine, but that kind of defeats the purpose of declaring RAY_ADDRESS in the pod YAML file, doesn’t it?

So I thought I wouldn’t need to call ray.init() at all and could just run ds = ray.data.range(10000), but it breaks:

2023-01-27 06:02:18,939	INFO worker.py:1230 -- Using address ray://<ray-cluster-name>.<mynamespace>.svc.cluster.local:10001 set in the environment variable RAY_ADDRESS
.
.
(truncated)
.
.
TypeError: 'str' object cannot be interpreted as an integer

Why does it break, even though it seems to be able to connect to the remote Ray cluster?
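
For comparison, a minimal sketch of the middle-ground variant: calling ray.init() explicitly but with no address argument, on the assumption (suggested by the "Using address ... set in the environment variable RAY_ADDRESS" log line above) that Ray then falls back to RAY_ADDRESS. connect_from_env is a hypothetical helper name, and I have not confirmed that this avoids the TypeError.

```python
def connect_from_env():
    """Connect using RAY_ADDRESS from the environment, if not already connected.

    Hypothetical helper; assumes ray.init() with no address argument
    picks up the RAY_ADDRESS environment variable.
    """
    import ray  # imported lazily so the helper can be defined before Ray is loaded

    if not ray.is_initialized():
        # No address passed here on purpose: the env variable should supply it.
        ray.init()
    return ray


# Usage in the notebook (needs a reachable cluster):
# ray = connect_from_env()
# ds = ray.data.range(10000)
# ds.take(5)
```

This keeps the address in the pod YAML only, while still making the connection step explicit instead of relying on ray.data to initialize Ray implicitly.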