We upgraded Ray on our k8s cluster from 2.7.1 to 2.8.1 (we also tried 2.9.2).
Base image: rayproject/ray:2.7.1-py38 → rayproject/ray:2.8.1-py38
KubeRay was upgraded from 1.0.0rc1 to 1.0.0.
Since then, the dashboard no longer shows the nodes summary.
The endpoint returns the following response:
{
"result": false,
"msg": "Traceback (most recent call last):\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/optional_utils.py\", line 224, in _update_cache\n response = task.result()\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/modules/node/node_head.py\", line 311, in get_all_nodes\n all_node_summary, nodes_logical_resources = await asyncio.gather(\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/datacenter.py\", line 173, in get_all_node_summary\n return [\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/datacenter.py\", line 174, in <listcomp>\n await DataOrganizer.get_node_info(node_id, get_summary=True)\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/datacenter.py\", line 146, in get_node_info\n node_info[\"status\"] = node[\"stateSnapshot\"][\"state\"]\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/utils.py\", line 426, in __getitem__\n proxy = self._proxy[item] = make_immutable(self._dict[item])\nKeyError: 'stateSnapshot'\n",
"data": {}
}
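For readability, the escaped traceback in `msg` can be decoded with a few lines of Python. This is only a minimal sketch: `summarize_error` is a hypothetical helper name, and `response_body` is a shortened copy of the response above (only the last two frames are kept), not a new log.

```python
import json

# Shortened copy of the response body above (last two frames only).
# The raw string keeps the JSON escapes (\n, \") intact for json.loads.
response_body = r'''
{
  "result": false,
  "msg": "Traceback (most recent call last):\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/datacenter.py\", line 146, in get_node_info\n node_info[\"status\"] = node[\"stateSnapshot\"][\"state\"]\n File \"/home/ray/anaconda3/lib/python3.8/site-packages/ray/dashboard/utils.py\", line 426, in __getitem__\n proxy = self._proxy[item] = make_immutable(self._dict[item])\nKeyError: 'stateSnapshot'\n",
  "data": {}
}
'''


def summarize_error(body: str) -> dict:
    """Hypothetical helper: decode the response and pull out the final error line."""
    payload = json.loads(body)
    # Drop empty lines so the last entry is the exception itself.
    lines = [line for line in payload.get("msg", "").splitlines() if line.strip()]
    return {
        "ok": bool(payload.get("result")),
        "error": lines[-1] if lines else "",
        "traceback": "\n".join(lines),
    }


summary = summarize_error(response_body)
print(summary["traceback"])
print("Final error:", summary["error"])
```

Decoded this way, the failing statement is easier to spot: the dashboard reads `node["stateSnapshot"]["state"]` in `get_node_info`, and the lookup fails because the node entry has no `stateSnapshot` key.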
Because of that, we can't see the cluster status either.
Any hints on how to fix this, or on what could cause this behavior?
I'm happy to provide more information, though I'm not sure what else would be useful at this point.
Thanks in advance!
How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.