Gcs_server.out file filling up with Couldn't get resource request from raylet

Using the latest Ray 1.7.1 (also tried 1.7.0) and running a Ray cluster (both on a local laptop and on K8s), I noticed that the gcs_server.out file quickly fills up with the following annoying Info-level message.

[2021-10-27 16:48:50,643 I 17605 680999] gcs_resource_report_poller.cc:138: Couldn't get resource request from raylet af993d005e5e2fce5630439673216e4b29f806bfb525f398672564b8: IOError: 14: failed to connect to all addresses
[2021-10-27 16:48:50,643 I 17605 680999] gcs_resource_report_poller.cc:138: Couldn't get resource request from raylet f56065af092b3da0050551b70793a6a9f71bf29e157397681b9004cb: IOError: 14: failed to connect to all addresses
[2021-10-27 16:48:50,643 I 17605 680999] gcs_resource_report_poller.cc:138: Couldn't get resource request from raylet d609b55b8d7e5e8415ed484f03ccbfaf697fea6b9b6322e5e900008f: IOError: 14: failed to connect to all addresses

All Ray services are started and running. I tried starting Ray with the logging-level=‘error’ option, but it has no effect and appears to be ignored.

How do I stop this annoying message? Is there a way to set the logging level? Please help me resolve this ASAP. Thanks.

Setting export RAY_BACKEND_LOG_LEVEL=error before starting Ray fixes this issue.
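
For anyone launching Ray from a script rather than a shell, here is a minimal sketch of the same workaround in Python. RAY_BACKEND_LOG_LEVEL is the variable from the post above. The helper names backend_env and start_head are hypothetical and only for illustration. It assumes the ray CLI is on the PATH.

```python
import os
import subprocess


def backend_env(level="error", base=None):
    """Return a copy of the environment with RAY_BACKEND_LOG_LEVEL set.

    `base` defaults to the current process environment; it is a parameter
    only so the helper can be exercised without touching os.environ.
    """
    env = dict(os.environ if base is None else base)
    env["RAY_BACKEND_LOG_LEVEL"] = level
    return env


def start_head(level="error"):
    """Hypothetical helper: run `ray start --head` with the variable set.

    The variable must be in the environment of the process that starts
    Ray, so it is passed via `env=` instead of being set afterwards.
    """
    return subprocess.run(
        ["ray", "start", "--head"],
        env=backend_env(level),
        check=True,
    )
```

Calling start_head() should behave like running export RAY_BACKEND_LOG_LEVEL=error followed by ray start --head in a shell. On K8s the variable would need to be set in the container environment instead.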

In the dashboard.log file, I see this error:

2021-10-27 17:21:13,146 ERROR base_events.py:1619 -- Exception in callback PollerCompletionQueue._handle_events(<_UnixSelecto...e debug=False>)()
handle: <Handle PollerCompletionQueue._handle_events(<_UnixSelecto...e debug=False>)()>
Traceback (most recent call last):
  File ".pyenv/versions/3.7.12/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/completion_queue.pyx.pxi", line 147, in grpc._cython.cygrpc.PollerCompletionQueue._handle_events
BlockingIOError: [Errno 35] Resource temporarily unavailable

Any idea what causes this error?

Would you mind trying with Ray 1.8.0 to see if you still get this?

Yes, I tried it with Ray 1.8.0 and no longer get the error message in dashboard.log. However, I still see the same issue with gcs_server.out filling up every minute, and the same thing is happening with dashboard.log:

2021-11-05 14:46:49,231	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:46:54,181	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:04,162	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:09,208	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:14,159	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:19,206	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:24,253	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:34,216	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:39,170	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
2021-11-05 14:47:44,215	INFO node_head.py:257 -- Received a log for 10.244.2.11 and autoscaler
[2021-11-05 14:50:54,519 I 13 13] gcs_resource_report_poller.cc:138: Couldn't get resource request from raylet e328fddcd75ca11645b3d522cda29d7bfa2f91536d27cd70bf42888f: IOError: 14: failed to connect to all addresses
[2021-11-05 14:51:01,161 I 13 13] gcs_resource_report_poller.cc:138: Couldn't get resource request from raylet 4ec9c626cfc499294131bd8069b42e17d1b8f8847eeb351c160a2eff: IOError: 14: failed to connect to all addresses

NOTE: the dashboard won’t start in Ray 1.8.0 due to #19940. I had to downgrade aiohttp to <3.8 to get it to work.

“Received a log” looks normal.
cc @Alex for “Couldn’t get resource request” – that looks potentially concerning.
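
While waiting for a diagnosis, it may help to know which raylet IDs the message is being logged for and how often. This is a rough sketch that counts them from gcs_server.out lines. It assumes the line format shown in the excerpts above. The function name count_unreachable_raylets is made up for illustration.

```python
import re
from collections import Counter

# Matches the message text from the excerpts above and captures the
# hex raylet ID that follows "from raylet".
RAYLET_ERROR = re.compile(
    r"Couldn't get resource request from raylet ([0-9a-f]+):"
)


def count_unreachable_raylets(lines):
    """Count how many times each raylet ID appears in the poller message."""
    counts = Counter()
    for line in lines:
        match = RAYLET_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, with open("gcs_server.out") as f: print(count_unreachable_raylets(f).most_common()) would list the raylet IDs by frequency. The path to the log file depends on the setup.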