Hello, I am trying to run Ray on minikube for some tests. The workers I am spawning are stuck in a Pending state. Below is a snippet of the log file; can you please help me resolve the error?
======== Autoscaler status: 2021-04-12 08:19:50.427483 ========
Node status
Healthy:
1 head-node
Pending:
None: worker-node, waiting-for-ssh
None: worker-node, waiting-for-ssh
Recent failures:
(no failures)
Resources
Usage:
0.0/1.0 CPU
0.00/0.350 GiB memory
0.00/0.136 GiB object_store_memory
Demands:
(no resource demands)
example-cluster:2021-04-12 08:19:50,440 DEBUG legacy_info_string.py:24 -- Cluster status: 2 nodes (2 updating)
- MostDelayedHeartbeats: {'172.17.0.5': 0.1501305103302002}
- NodeIdleSeconds: Min=41 Mean=41 Max=41
- ResourceUsage: 0.0/1.0 CPU, 0.0 GiB/0.35 GiB memory, 0.0 GiB/0.14 GiB object_store_memory
- TimeSinceLastHeartbeat: Min=0 Mean=0 Max=0
Worker node types: - worker-node: 2
example-cluster:2021-04-12 08:19:54,663 INFO command_runner.py:172 -- NodeUpdater: example-cluster-ray-worker-5xm2k: Running kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)'
example-cluster:2021-04-12 08:19:54,672 INFO command_runner.py:172 -- NodeUpdater: example-cluster-ray-worker-7gbrc: Running kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)'
Unable to use a TTY - input is not a terminal or the right kind of file
Error from server (BadRequest): pod example-cluster-ray-worker-5xm2k does not have a host assigned
2021-04-12 08:18:58,714 INFO commands.py:238 -- Cluster: example-cluster
2021-04-12 08:18:58,734 INFO commands.py:301 -- Checking Kubernetes environment settings
2021-04-12 08:18:58,764 INFO commands.py:573 -- No head node found. Launching a new cluster. Confirm [y/N]: y [automatic, due to --yes]
2021-04-12 08:18:58,765 INFO commands.py:618 -- Acquiring an up-to-date head node
2021-04-12 08:18:58,780 INFO commands.py:640 -- Launched a new head node
2021-04-12 08:18:58,780 INFO commands.py:644 -- Fetching the new head node
2021-04-12 08:18:58,787 INFO commands.py:663 -- <1/1> Setting up head node
2021-04-12 08:18:58,801 INFO updater.py:286 -- New status: waiting-for-ssh
2021-04-12 08:18:58,801 INFO updater.py:234 -- [1/7] Waiting for SSH to become available
2021-04-12 08:18:58,801 INFO updater.py:237 -- Running `uptime` as a test.
2021-04-12 08:18:58,970 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-head-npc4d -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:04,468 SUCC updater.py:245 -- Success.
2021-04-12 08:19:04,468 INFO log_timer.py:27 -- NodeUpdater: example-cluster-ray-head-npc4d: Got remote shell [LogTimer=5667ms]
2021-04-12 08:19:04,475 INFO updater.py:327 -- Updating cluster configuration. [hash=c0249662bba0236e226fdb78e359c1ce573f3257]
2021-04-12 08:19:04,489 INFO updater.py:331 -- New status: syncing-files
2021-04-12 08:19:04,489 INFO updater.py:212 -- [2/7] Processing file mounts
2021-04-12 08:19:04,489 INFO updater.py:229 -- [3/7] No worker file mounts to sync
2021-04-12 08:19:04,500 INFO updater.py:342 -- New status: setting-up
2021-04-12 08:19:04,500 INFO updater.py:380 -- [4/7] No initialization commands to run.
2021-04-12 08:19:04,500 INFO updater.py:384 -- [5/7] Initalizing command runner
2021-04-12 08:19:04,501 INFO updater.py:429 -- [6/7] No setup commands to run.
2021-04-12 08:19:04,501 INFO updater.py:433 -- [7/7] Starting the Ray runtime
2021-04-12 08:19:08,414 INFO log_timer.py:27 -- NodeUpdater: example-cluster-ray-head-npc4d: Ray start commands succeeded [LogTimer=3913ms]
2021-04-12 08:19:08,414 INFO log_timer.py:27 -- NodeUpdater: example-cluster-ray-head-npc4d: Applied config c0249662bba0236e226fdb78e359c1ce573f3257 [LogTimer=9626ms]
2021-04-12 08:19:08,435 INFO updater.py:161 -- New status: up-to-date
2021-04-12 08:19:08,438 INFO commands.py:742 -- Useful commands
2021-04-12 08:19:08,438 INFO commands.py:744 -- Monitor autoscaling with
2021-04-12 08:19:08,438 INFO commands.py:747 -- ray exec /home/ray/ray_cluster_configs/example-cluster_config.yaml 'tail -n 100 -f /tmp/ray/session_latest/logs/monitor*'
2021-04-12 08:19:08,438 INFO commands.py:749 -- Connect to a terminal on the cluster head:
2021-04-12 08:19:08,438 INFO commands.py:751 -- ray attach /home/ray/ray_cluster_configs/example-cluster_config.yaml
2021-04-12 08:19:08,438 INFO commands.py:754 -- Get a remote shell to the cluster manually:
2021-04-12 08:19:08,438 INFO commands.py:755 -- kubectl -n ray exec -it example-cluster-ray-head-npc4d -- bash
2021-04-12 08:19:13,812 INFO updater.py:286 -- New status: waiting-for-ssh
2021-04-12 08:19:13,812 INFO updater.py:234 -- [1/7] Waiting for SSH to become available
2021-04-12 08:19:13,812 INFO updater.py:237 -- Running `uptime` as a test.
2021-04-12 08:19:13,816 INFO updater.py:286 -- New status: waiting-for-ssh
2021-04-12 08:19:13,816 INFO updater.py:234 -- [1/7] Waiting for SSH to become available
2021-04-12 08:19:13,816 INFO updater.py:237 -- Running `uptime` as a test.
2021-04-12 08:19:13,940 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:13,982 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:19,069 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:19,119 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:24,140 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:24,179 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:29,245 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:29,279 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:34,353 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:34,377 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:39,451 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:39,474 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:44,561 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:44,576 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:49,671 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-5xm2k -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
2021-04-12 08:19:49,684 INFO updater.py:277 -- SSH still not available (Exit Status 1): kubectl -n ray exec -it example-cluster-ray-worker-7gbrc -- bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (uptime)', retrying in 5 seconds.
Unable to use a TTY - input is not a terminal or the right kind of file
Error from server (BadRequest): pod example-cluster-ray-worker-7gbrc does not have a host assigned
example-cluster:2021-04-12 08:19:55,556 DEBUG resource_demand_scheduler.py:158 -- Cluster resources: [{'object_store_memory': 145933516.0, 'memory': 375809638.0, 'CPU': 1.0, 'node:172.17.0.5': 1.0}, {'CPU': 1, 'bar': 1, 'foo': 1, 'memory': 375809638}, {'CPU': 1, 'bar': 1, 'foo': 1, 'memory': 375809638}]
example-cluster:2021-04-12 08:19:55,556 DEBUG resource_demand_scheduler.py:159 -- Node counts: defaultdict(<class 'int'>, {'head-node': 1, 'worker-node': 2})
example-cluster:2021-04-12 08:19:55,556 DEBUG resource_demand_scheduler.py:170 -- Placement group demands: []
example-cluster:2021-04-12 08:19:55,556 DEBUG resource_demand_scheduler.py:216 -- Resource demands: []
example-cluster:2021-04-12 08:19:55,557 DEBUG resource_demand_scheduler.py:217 -- Unfulfilled demands: []
example-cluster:2021-04-12 08:19:55,577 DEBUG resource_demand_scheduler.py:239 -- Node requests: {}
example-cluster:2021-04-12 08:19:55,605 INFO autoscaler.py:325 --
======== Autoscaler status: 2021-04-12 08:19:55.605627 ========
Node status
Healthy:
1 head-node
Pending:
None: worker-node, waiting-for-ssh
None: worker-node, waiting-for-ssh
Recent failures:
(no failures)
Resources
Usage:
0.0/1.0 CPU
0.00/0.350 GiB memory
0.00/0.136 GiB object_store_memory
Demands:
(no resource demands)
example-cluster:2021-04-12 08:19:55,618 DEBUG legacy_info_string.py:24 -- Cluster status: 2 nodes (2 updating)
- MostDelayedHeartbeats: {'172.17.0.5': 0.15698885917663574}
- NodeIdleSeconds: Min=47 Mean=47 Max=47
- ResourceUsage: 0.0/1.0 CPU, 0.0 GiB/0.35 GiB memory, 0.0 GiB/0.14 GiB object_store_memory
- TimeSinceLastHeartbeat: Min=0 Mean=0 Max=0
Worker node types: - worker-node: 2
$ kubectl -n ray get pods
NAME                               READY   STATUS    RESTARTS   AGE
example-cluster-ray-head-npc4d     1/1     Running   0          4m42s
example-cluster-ray-worker-5xm2k   0/1     Pending   0          4m32s
example-cluster-ray-worker-7gbrc   0/1     Pending   0          4m32s
ray-operator-pod                   1/1     Running   3          18m
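
The "does not have a host assigned" error above means the worker pods have not yet been scheduled onto a node, so a possible next diagnostic step is to pick out the Pending pods and look at their scheduling events. The sketch below is only an illustration: `pending_pods` is a hypothetical helper name (not part of Ray or kubectl), and it assumes the default column layout of `kubectl get pods` shown above, where STATUS is the third column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of pods whose STATUS column is "Pending".
# Reads `kubectl get pods` output on stdin and skips the header row.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Usage against a live cluster (assumes the `ray` namespace from the logs above):
#   kubectl -n ray get pods | pending_pods
#
# The Events section of `kubectl describe pod` reports why the scheduler
# has not placed a pod on a node:
#   kubectl -n ray get pods | pending_pods | while read -r pod; do
#     kubectl -n ray describe pod "$pod"
#   done
```

Run against the output above, this would list the two worker pods; the events from `describe` should then show the reason they remain unscheduled.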