Are there any restrictions or limitations on running Ray >= 1.10 with Python 3.6.9?
For some reason, with any Ray version above 1.8 I get the following error (with 1.6 it works fine):
2022-07-21 08:54:30,177 INFO server.py:842 -- Starting Ray Client server on 0.0.0.0:10001
2022-07-21 10:15:17,379 INFO proxier.py:670 -- New data connection from client ace70b4b753146babdea12418dbbb528:
2022-07-21 10:15:18,403 INFO proxier.py:341 -- SpecificServer started on port: 23000 with PID: 415 for client: ace70b4b753146babdea12418dbbb528
2022-07-21 10:15:48,405 ERROR proxier.py:379 -- Timeout waiting for channel for ace70b4b753146babdea12418dbbb528
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/ray/util/client/server/proxier.py", line 375, in get_channel
    timeout=CHECK_CHANNEL_TIMEOUT_S
  File "/usr/local/lib/python3.6/dist-packages/grpc/_utilities.py", line 140, in result
    self._block(timeout)
  File "/usr/local/lib/python3.6/dist-packages/grpc/_utilities.py", line 86, in _block
    raise grpc.FutureTimeoutError()
grpc.FutureTimeoutError
2022-07-21 10:15:48,405 ERROR proxier.py:379 -- Timeout waiting for channel for ace70b4b753146babdea12418dbbb528
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/ray/util/client/server/proxier.py", line 375, in get_channel
    timeout=CHECK_CHANNEL_TIMEOUT_S
  File "/usr/local/lib/python3.6/dist-packages/grpc/_utilities.py", line 140, in result
    self._block(timeout)
  File "/usr/local/lib/python3.6/dist-packages/grpc/_utilities.py", line 86, in _block
    raise grpc.FutureTimeoutError()
grpc.FutureTimeoutError
2022-07-21 10:15:48,406 ERROR proxier.py:692 -- Channel not found for ace70b4b753146babdea12418dbbb528
2022-07-21 10:15:48,406 WARNING proxier.py:777 -- Retrying Logstream connection. 1 attempts failed.
Hi @Chen_Shen, thanks for replying so quickly.
The output from pip freeze | grep grpc on the head, the workers, and the job is:
grpcio==1.39.0
grpcio-tools==1.39.0
Unfortunately, 1.13.0 does not resolve my issue.
I double-checked that all my pods are running ray==1.13.0.
Ran into this issue for my application. For context, I'm hosting a Ray cluster on AWS EC2 instances in a VPC. The instances are only accessible through a jump host, so I have a user-defined SSH proxy command in my cluster config file. Additionally, all traffic in the AWS environment goes through a proxy: EC2 instances are configured with proxy info when they're launched, and, since I'm using Ray in Docker, the node Docker containers get the proxy info through environment variables set via Docker run options. My test job just trains a PPO agent with RLlib on a dummy environment defined in my script.
Similar to @ray1, I get gRPC timeout and Ray client/server errors when I run ray attach $config -p 10001 and use ray.init("ray://localhost:10001") in my test job script, but I'm using Python 3.8 and Ray 2.2. If I ray rsync_up my test job script and run it from the head node, everything works as expected. My pip freeze contents are below (shouldn't be anything too wild, since they're just the rayproject/ray-ml:latest-py38-cpu Docker image requirements):
My EC2 security groups should already allow inbound/outbound traffic on port 10001, but I added rules to explicitly allow it and still had no luck. Any recommendations, @jjyao? I can share my test job script, but I can't share much else about the cloud environment.
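Stripped down, the client side of the script is roughly this sketch (the ping task is just a placeholder to exercise the connection; the actual RLlib training code is omitted):

import ray

# Tunnel opened beforehand with: ray attach <cluster config> -p 10001
ray.init("ray://localhost:10001")

@ray.remote
def ping():
    return "pong"

# A trivial round trip over the Ray Client connection.
print(ray.get(ping.remote()))

ray.shutdown()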
Edit: I also confirmed that all nodes, as well as the local client trying to connect and run the test job script, share the same dependencies as the pip freeze above.
For folks tuning in who are blocked by this: a workaround is to use the ray dashboard port-forwarding and ray job submit CLIs together. That lets you submit jobs to the cluster from your local machine. There are already so many CLIs to juggle, though, so it would be nice if we could connect to the remote cluster and run a job just by changing the ray.init() address.
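If you'd rather stay in Python than chain the CLIs, the Job Submission SDK can do the same thing against the port-forwarded dashboard; a rough sketch (the script name and working dir are placeholders for whatever your job needs):

from ray.job_submission import JobSubmissionClient

# The dashboard is reachable locally after running
# ray dashboard <cluster config>, which forwards port 8265 from the head node.
client = JobSubmissionClient("http://127.0.0.1:8265")

job_id = client.submit_job(
    entrypoint="python test_job.py",   # placeholder entrypoint
    runtime_env={"working_dir": "."},  # ship the local directory to the cluster
)
print(client.get_job_status(job_id))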
Sorry for the late reply. Ray Client is no longer the recommended way to run your Ray applications. For development, you can run your script directly on the head node. For production, you can use Ray Jobs (i.e. ray job submit). That way you don't need to worry about mismatched environments between your laptop and the remote cluster.