I have a K8s cluster running and am trying to deploy rayproject/ray-ml:nightly nodes for both the head and the workers. When I run ray monitor cluster.yaml after ray up completes, I see the following stack trace:
2021-04-01 19:28:44,147 ERROR monitor.py:245 -- Error in monitor loop
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/monitor.py", line 276, in run
    self._initialize_autoscaler()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/monitor.py", line 125, in _initialize_autoscaler
    event_summarizer=self.event_summarizer)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 86, in __init__
    self.reset(errors_fatal=True)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 521, in reset
    raise e
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 479, in reset
    self.config["cluster_name"])
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/providers.py", line 186, in _get_node_provider
    provider_cls = _get_node_provider_cls(provider_config)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/providers.py", line 162, in _get_node_provider_cls
    return importer(provider_config)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/providers.py", line 54, in _import_kubernetes
    from ray.autoscaler._private.kubernetes.node_provider import \
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/__init__.py", line 1, in <module>
    import kubernetes
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/__init__.py", line 2, in <module>
    from kubernetes.config.config_exception import ConfigException
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/config.py", line 6, in <module>
    from kubernetes import client
ImportError: cannot import name 'client' from 'kubernetes' (/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/__init__.py)
I’ve connected to the container, started python, and was able to both import kubernetes and run from kubernetes import client there, so I’m out of ideas about what to try next.
The installed kubernetes version is:
>>> kubernetes.__version__
'12.0.1'
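One thing that stands out in the traceback is that the ImportError path points at Ray's own ray/autoscaler/_private/kubernetes/__init__.py rather than at the pip-installed kubernetes package, so the name kubernetes may be resolving to a different module inside the monitor process than it does in my interactive session. Below is a minimal sketch of a check that could be run in both environments to compare. The helper names describe_module and shadowing_entries are my own, and the "ray/autoscaler/_private" fragment is just an assumption about where a stray sys.path entry might point; this is a diagnostic, not a fix.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file a top-level import of `name` would load, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def shadowing_entries(paths, fragment="ray/autoscaler/_private"):
    """Return path entries that point inside the given directory fragment.

    The default fragment is an assumption based on the traceback above.
    """
    return [p for p in paths if fragment in p.replace("\\", "/")]


if __name__ == "__main__":
    # Where does a plain `import kubernetes` resolve in this interpreter?
    print("kubernetes resolves to:", describe_module("kubernetes"))
    # Any sys.path entry inside Ray's private autoscaler directory could
    # let Ray's own `kubernetes` subpackage shadow the real client library.
    print("suspicious sys.path entries:", shadowing_entries(sys.path))
```

If the two environments print different locations for kubernetes, that would suggest the problem is with the import path in the monitor process rather than with the installed kubernetes version.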
If I drop back to rayproject/ray-ml:1.3.0, I don’t see this error, but I run into different RLlib errors instead.