- High: It blocks me from completing my task.
I am running 1 head node and 1 worker node, and I connect to the head using ray.init(address).
Lately I have had several instances where this connection timed out. Looking at the Ray head node, it appears it did not start properly, but it did stay up until I closed it.
Below is the log from the non-working Ray head instance; after it I have put the start of the head log from a working instance. In the working instance you can see it starting the monitor, connecting to port 10001, and so on.
Running Ray 2.3 on Linux.
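For reference, this is roughly how I connect from the driver. The port probe is just a diagnostic sketch I added while debugging (the head IP is a placeholder for my setup, and 10001 is the default Ray Client port):

```python
import socket
import ray

HEAD_IP = "10.244.0.134"  # placeholder: my head node's IP
CLIENT_PORT = 10001       # default Ray Client server port


def port_open(ip: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if something is listening on (ip, port)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


# In the bad state the head process stays up (GCS answers on 6385),
# but nothing ever listens on the client port, so ray.init() times out.
if not port_open(HEAD_IP, CLIENT_PORT):
    raise RuntimeError(f"Ray Client port {CLIENT_PORT} not reachable on {HEAD_IP}")

ray.init(address=f"ray://{HEAD_IP}:{CLIENT_PORT}")
print(ray.cluster_resources())
```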
Non-working Ray head log:
[2023-04-13 17:28:58,920 I 20 20] (gcs_server) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2023-04-13 17:28:58,921 I 20 20] (gcs_server) gcs_server.cc:58: GCS storage type is memory
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:44: Loading job table data.
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:56: Loading node table data.
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:68: Loading cluster resources table data.
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:95: Loading actor table data.
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:108: Loading actor task spec table data.
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:81: Loading placement group table data.
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:48: Finished loading job table data, size = 0
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:60: Finished loading node table data, size = 0
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:72: Finished loading cluster resources table data, size = 0
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:99: Finished loading actor table data, size = 0
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:112: Finished loading actor task spec table data, size = 0
[2023-04-13 17:28:58,922 I 20 20] (gcs_server) gcs_init_data.cc:86: Finished loading placement group table data, size = 0
[2023-04-13 17:28:58,923 I 20 20] (gcs_server) grpc_server.cc:140: GcsServer server started, listening on port 6385.
[2023-04-13 17:28:58,968 I 20 20] (gcs_server) gcs_server.cc:192: GcsNodeManager:
- RegisterNode request count: 0
- DrainNode request count: 0
- GetAllNodeInfo request count: 0
- GetInternalConfig request count: 0
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetResources request count: 0
- GetAllAvailableResources request count0
- ReportResourceUsage request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
GcsPublisher {}
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Total num bytes of task event stored: 0MiB
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GrpcBasedResourceBroadcaster:
- Tracked nodes: 0
[2023-04-13 17:28:58,968 I 20 20] (gcs_server) gcs_server.cc:757: Event stats:
Global stats: 19 total (10 active)
Queueing time: mean = 4.818 ms, max = 45.684 ms, min = 3.330 us, total = 91.536 ms
Execution time: mean = 2.411 ms, total = 45.816 ms
Event stats:
GcsInMemoryStore.GetAll - 6 total (0 active), CPU time: mean = 7.631 ms, total = 45.788 ms
InternalKVGcsService.grpc_client.InternalKVPut - 6 total (6 active), CPU time: mean = 0.000 s, total = 0.000 s
PeriodicalRunner.RunFnPeriodically - 4 total (2 active, 1 running), CPU time: mean = 6.255 us, total = 25.018 us
RayletLoadPulled - 1 total (1 active), CPU time: mean = 0.000 s, total = 0.000 s
GcsInMemoryStore.Put - 1 total (0 active), CPU time: mean = 3.029 us, total = 3.029 us
RaySyncer.deadline_timer.report_resource_report - 1 total (1 active), CPU time: mean = 0.000 s, total = 0.000 s
[2023-04-13 17:28:58,969 I 20 20] (gcs_server) gcs_server.cc:758: GcsTaskManager Event stats:
Global stats: 0 total (0 active)
Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
Execution time: mean = -nan s, total = 0.000 s
Event stats:
[2023-04-13 17:29:08,955 W 20 27] (gcs_server) metric_exporter.cc:209: [1] Export metrics to agent failed: GrpcUnavailable: RPC Error message: failed to connect to all addresses; RPC Error details: . This won't affect Ray, but you can lose metrics from the cluster.
[2023-04-13 17:29:18,604 I 20 20] (gcs_server) gcs_node_manager.cc:42: Registering node info, node id = 0ae1f98ffe4a33dec11ac918968f070f617369fd9e6b00da80615cc7, address = 10.244.0.166, node name = linux-0
[2023-04-13 17:29:18,604 I 20 20] (gcs_server) gcs_node_manager.cc:48: Finished registering node info, node id = 0ae1f98ffe4a33dec11ac918968f070f617369fd9e6b00da80615cc7, address = 10.244.0.166, node name = linux-0
[2023-04-13 17:29:18,604 I 20 20] (gcs_server) gcs_placement_group_manager.cc:763: A new node: 0ae1f98ffe4a33dec11ac918968f070f617369fd9e6b00da80615cc7 registered, will try to reschedule all the infeasible placement groups.
[2023-04-13 17:29:18,612 I 20 20] (gcs_server) gcs_job_manager.cc:149: Getting all job info.
[2023-04-13 17:29:18,613 I 20 20] (gcs_server) gcs_job_manager.cc:218: Finished getting all job info.
[2023-04-13 17:29:58,969 I 20 20] (gcs_server) gcs_server.cc:192: GcsNodeManager:
- RegisterNode request count: 1
- DrainNode request count: 0
- GetAllNodeInfo request count: 3
- GetInternalConfig request count: 2
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetResources request count: 0
- GetAllAvailableResources request count0
- ReportResourceUsage request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
GcsPublisher {}
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Total num bytes of task event stored: 0MiB
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GrpcBasedResourceBroadcaster:
- Tracked nodes: 1
[2023-04-13 17:29:58,969 I 20 20] (gcs_server) gcs_server.cc:757: Event stats:
Global stats: 2222 total (4 active)
Queueing time: mean = 179.960 us, max = 45.684 ms, min = -0.000 s, total = 399.871 ms
Execution time: mean = 79.811 us, total = 177.339 ms
Event stats:
RaySyncer.deadline_timer.report_resource_report - 599 total (1 active), CPU time: mean = 79.682 us, total = 47.729 ms
NodeManagerService.grpc_client.RequestResourceReport - 391 total (0 active), CPU time: mean = 47.510 us, total = 18.576 ms
ResourceUpdate - 391 total (0 active), CPU time: mean = 18.417 us, total = 7.201 ms
NodeManagerService.grpc_client.UpdateResourceUsage - 390 total (0 active), CPU time: mean = 23.669 us, total = 9.231 ms
GcsInMemoryStore.Put - 80 total (0 active), CPU time: mean = 59.672 us, total = 4.774 ms
InternalKVGcsService.grpc_server.InternalKVPut - 78 total (0 active), CPU time: mean = 13.073 us, total = 1.020 ms
InternalKVGcsService.grpc_client.InternalKVPut - 72 total (0 active), CPU time: mean = 10.127 us, total = 729.157 us
RayletLoadPulled - 60 total (1 active), CPU time: mean = 509.220 us, total = 30.553 ms
GcsInMemoryStore.Get - 42 total (0 active), CPU time: mean = 42.718 us, total = 1.794 ms
InternalKVGcsService.grpc_server.InternalKVGet - 40 total (0 active), CPU time: mean = 43.171 us, total = 1.727 ms
NodeManagerService.grpc_client.GetResourceLoad - 40 total (0 active), CPU time: mean = 20.563 us, total = 822.535 us
HealthCheck - 12 total (0 active), CPU time: mean = 8.093 us, total = 97.118 us
GcsInMemoryStore.GetAll - 8 total (0 active), CPU time: mean = 5.786 ms, total = 46.289 ms
GCSServer.deadline_timer.debug_state_dump - 6 total (1 active), CPU time: mean = 704.157 us, total = 4.225 ms
PeriodicalRunner.RunFnPeriodically - 4 total (0 active), CPU time: mean = 201.656 us, total = 806.625 us
NodeInfoGcsService.grpc_server.GetAllNodeInfo - 3 total (0 active), CPU time: mean = 325.495 us, total = 976.484 us
NodeInfoGcsService.grpc_server.GetInternalConfig - 2 total (0 active), CPU time: mean = 40.109 us, total = 80.218 us
JobInfoGcsService.grpc_server.GetAllJobInfo - 1 total (0 active), CPU time: mean = 573.451 us, total = 573.451 us
NodeInfoGcsService.grpc_server.RegisterNode - 1 total (0 active), CPU time: mean = 125.548 us, total = 125.548 us
GcsHealthCheckManager::AddNode - 1 total (0 active), CPU time: mean = 8.198 us, total = 8.198 us
GCSServer.deadline_timer.debug_state_event_stats_print - 1 total (1 active, 1 running), CPU time: mean = 0.000 s, total = 0.000 s
[2023-04-13 17:29:58,969 I 20 20] (gcs_server) gcs_server.cc:758: GcsTaskManager Event stats:
Global stats: 0 total (0 active)
Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
Execution time: mean = -nan s, total = 0.000 s
Event stats:
Working Ray head instance (the start of the log is identical to the non-working instance, up to the part where the monitor starts):
Global stats: 0 total (0 active)
Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
Execution time: mean = -nan s, total = 0.000 s
Event stats:
2023-04-14 00:21:30,790 INFO (monitor) monitor.py:651 -- Starting monitor using ray installation: /usr/local/lib/python3.10/dist-packages/ray/__init__.py
2023-04-14 00:21:30,790 INFO (monitor) monitor.py:652 -- Ray version: 2.3.0
2023-04-14 00:21:30,790 INFO (monitor) monitor.py:653 -- Ray commit: cf7a56b4b0b648c324722df7c99c168e92ff0b45
2023-04-14 00:21:30,790 INFO (monitor) monitor.py:654 -- Monitor started with command: ['/usr/local/lib/python3.10/dist-packages/ray/autoscaler/_private/monitor.py', '--logs-dir=/tmp/ray/session_2023-04-14_00-21-28_973433_7/logs', '--logging-rotate-bytes=536870912', '--logging-rotate-backup-count=5', '--gcs-address=10.244.0.134:6385', '--logging-filename=', '--logging-format=%(asctime)s\t%(levelname)s (monitor) %(filename)s:%(lineno)s -- %(message)s', '--monitor-ip=10.244.0.134']
2023-04-14 00:21:30,794 INFO (monitor) monitor.py:167 -- session_name: session_2023-04-14_00-21-28_973433_7
2023-04-14 00:21:30,796 INFO (monitor) monitor.py:198 -- Starting autoscaler metrics server on port 44217
2023-04-14 00:21:30,797 INFO (monitor) monitor.py:218 -- Monitor: Started
2023-04-14 00:21:30,807 INFO (monitor) autoscaler.py:276 -- disable_node_updaters:False
2023-04-14 00:21:30,807 INFO (monitor) autoscaler.py:284 -- disable_launch_config_check:False
2023-04-14 00:21:30,807 INFO (monitor) autoscaler.py:296 -- foreground_node_launch:False
2023-04-14 00:21:30,807 INFO (monitor) autoscaler.py:306 -- worker_liveness_check:True
2023-04-14 00:21:30,807 INFO (monitor) autoscaler.py:314 -- worker_rpc_drain:True
2023-04-14 00:21:30,808 INFO (monitor) autoscaler.py:364 -- StandardAutoscaler: {'cluster_name': 'default', 'max_workers': 0, 'upscaling_speed': 1.0, 'docker': {}, 'idle_timeout_minutes': 0, 'provider': {'type': 'readonly', 'use_node_id_as_ip': True}, 'auth': {}, 'available_node_types': {'ray.head.default': {'resources': {}, 'node_config': {}, 'max_workers': 0}}, 'head_node_type': 'ray.head.default', 'file_mounts': {}, 'cluster_synced_files': [], 'file_mounts_sync_continuously': False, 'rsync_exclude': [], 'rsync_filter': [], 'initialization_commands': [], 'setup_commands': [], 'head_setup_commands': [], 'worker_setup_commands': [], 'head_start_ray_commands': [], 'worker_start_ray_commands': [], 'head_node': {}, 'worker_nodes': {}}
2023-04-14 00:21:30,810 INFO (monitor) monitor.py:388 -- Autoscaler has not yet received load metrics. Waiting.
2023-04-14 00:21:30,845 INFO (dashboard) head.py:135 -- Dashboard head grpc address: 0.0.0.0:43797
2023-04-14 00:21:30,849 INFO (dashboard) head.py:239 -- Starting dashboard metrics server on port 44227
2023-04-14 00:21:30,851 INFO (dashboard) utils.py:112 -- Get all modules by type: DashboardHeadModule
2023-04-14 00:21:31,032 INFO (dashboard) utils.py:145 -- Available modules: [<class 'ray.dashboard.modules.actor.actor_head.ActorHead'>, <class 'ray.dashboard.modules.event.event_head.EventHead'>, <class 'ray.dashboard.modules.healthz.healthz_head.HealthzHead'>, <class 'ray.dashboard.modules.job.job_head.JobHead'>, <class 'ray.dashboard.modules.log.log_head.LogHead'>, <class 'ray.dashboard.modules.metrics.metrics_head.MetricsHead'>, <class 'ray.dashboard.modules.node.node_head.NodeHead'>, <class 'ray.dashboard.modules.reporter.reporter_head.ReportHead'>, <class 'ray.dashboard.modules.snapshot.snapshot_head.APIHead'>, <class 'ray.dashboard.modules.state.state_head.StateHead'>, <class 'ray.dashboard.modules.usage_stats.usage_stats_head.UsageStatsHead'>]
2023-04-14 00:21:31,032 INFO (dashboard) head.py:208 -- Modules to load: {'LogHead', 'APIHead', 'UsageStatsHead', 'EventHead', 'StateHead', 'ActorHead', 'MetricsHead', 'JobHead', 'NodeHead', 'ReportHead', 'HealthzHead'}
2023-04-14 00:21:31,033 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.actor.actor_head.ActorHead'>
2023-04-14 00:21:31,033 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.event.event_head.EventHead'>
2023-04-14 00:21:31,033 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.healthz.healthz_head.HealthzHead'>
2023-04-14 00:21:31,033 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.job.job_head.JobHead'>
2023-04-14 00:21:31,033 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.log.log_head.LogHead'>
2023-04-14 00:21:31,034 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.metrics.metrics_head.MetricsHead'>
2023-04-14 00:21:31,035 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.node.node_head.NodeHead'>
2023-04-14 00:21:31,035 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.reporter.reporter_head.ReportHead'>
2023-04-14 00:21:31,035 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.snapshot.snapshot_head.APIHead'>
2023-04-14 00:21:31,035 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.state.state_head.StateHead'>
2023-04-14 00:21:31,035 INFO (dashboard) head.py:211 -- Loading DashboardHeadModule: <class 'ray.dashboard.modules.usage_stats.usage_stats_head.UsageStatsHead'>
2023-04-14 00:21:31,035 INFO (dashboard) head.py:224 -- Loaded 11 modules. [<ray.dashboard.modules.actor.actor_head.ActorHead object at 0x7fb5e687e1d0>, <ray.dashboard.modules.event.event_head.EventHead object at 0x7fb5f63af100>, <ray.dashboard.modules.healthz.healthz_head.HealthzHead object at 0x7fb5f63af0d0>, <ray.dashboard.modules.job.job_head.JobHead object at 0x7fb5e5ceb9d0>, <ray.dashboard.modules.log.log_head.LogHead object at 0x7fb5e5ceba30>, <ray.dashboard.modules.metrics.metrics_head.MetricsHead object at 0x7fb5e5cebac0>, <ray.dashboard.modules.node.node_head.NodeHead object at 0x7fb5e5cebfa0>, <ray.dashboard.modules.reporter.reporter_head.ReportHead object at 0x7fb5e5cebf70>, <ray.dashboard.modules.snapshot.snapshot_head.APIHead object at 0x7fb5e5d24220>, <ray.dashboard.modules.state.state_head.StateHead object at 0x7fb5e5d24250>, <ray.dashboard.modules.usage_stats.usage_stats_head.UsageStatsHead object at 0x7fb5e5d24340>]
2023-04-14 00:21:31,035 INFO (dashboard) head.py:301 -- Initialize the http server.
2023-04-14 00:21:31,035 INFO (dashboard) http_server_head.py:85 -- Setup static dir for dashboard: /usr/local/lib/python3.10/dist-packages/ray/dashboard/client/build
2023-04-14 00:21:31,039 INFO (dashboard) http_server_head.py:204 -- Dashboard head http address: 10.244.0.134:8265
2023-04-14 00:21:31,039 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /logical/actors> -> <function ActorHead.get_all_actors[cache ttl=2, max_size=128] at 0x7fb5e6f070a0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /logical/actors/{actor_id}> -> <function ActorHead.get_actor[cache ttl=2, max_size=128] at 0x7fb5e6f07250>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /events> -> <function EventHead.get_event[cache ttl=2, max_size=128] at 0x7fb5e6837520>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/gcs_healthz> -> <function HealthzHead.health_check at 0x7fb5e68540d0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/version> -> <function JobHead.get_version at 0x7fb5e6898c10>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /api/packages/{protocol}/{package_name}> -> <function JobHead.get_package at 0x7fb5e6898d30>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [PUT] <DynamicResource /api/packages/{protocol}/{package_name}> -> <function JobHead.upload_package at 0x7fb5e6898e50>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [POST] <PlainResource /api/jobs/> -> <function JobHead.submit_job at 0x7fb5e6898f70>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [POST] <DynamicResource /api/jobs/{job_or_submission_id}/stop> -> <function JobHead.stop_job at 0x7fb5e6899090>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [DELETE] <DynamicResource /api/jobs/{job_or_submission_id}> -> <function JobHead.delete_job at 0x7fb5e68991b0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /api/jobs/{job_or_submission_id}> -> <function JobHead.get_job_info at 0x7fb5e68992d0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/jobs/> -> <function JobHead.list_jobs at 0x7fb5e68993f0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /api/jobs/{job_or_submission_id}/logs> -> <function JobHead.get_job_logs at 0x7fb5e6899510>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /api/jobs/{job_or_submission_id}/logs/tail> -> <function JobHead.tail_job_logs at 0x7fb5e6899630>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /log_index> -> <function LogHead.get_log_index at 0x7fb5e689ae60>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /log_proxy> -> <function LogHead.get_log_from_proxy at 0x7fb5e689af80>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/grafana_health> -> <function MetricsHead.grafana_health at 0x7fb5e5f2a680>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/prometheus_health> -> <function MetricsHead.prometheus_health at 0x7fb5e5f2a7a0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/progress> -> <function MetricsHead.get_progress at 0x7fb5e5f2a8c0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/progress_by_task_name> -> <function MetricsHead.get_progress_by_task_name at 0x7fb5e5f2a9e0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /internal/node_module> -> <function NodeHead.get_node_module_internal_state at 0x7fb5e5f2b520>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /nodes> -> <function NodeHead.get_all_nodes[cache ttl=2, max_size=128] at 0x7fb5e5f2b640>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /nodes/{node_id}> -> <function NodeHead.get_node[cache ttl=2, max_size=128] at 0x7fb5e5f2b7f0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/cluster_metadata> -> <function ReportHead.get_cluster_metadata at 0x7fb5e5e8a4d0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/cluster_status> -> <function ReportHead.get_cluster_status at 0x7fb5e5e8a5f0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /worker/traceback> -> <function ReportHead.get_traceback at 0x7fb5e5e8a710>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /worker/cpu_profile> -> <function ReportHead.cpu_profile at 0x7fb5e5e8a830>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/actors/kill> -> <function APIHead.kill_actor_gcs at 0x7fb5e5ccd870>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/snapshot> -> <function APIHead.snapshot at 0x7fb5e5ccd990>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/component_activities> -> <function APIHead.get_component_activities at 0x7fb5e5ccdab0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/actors> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5ccf9a0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/jobs> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5ccfb50>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/nodes> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5ccfd00>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/placement_groups> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5ccfeb0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/workers> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf00d0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/tasks> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0280>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/objects> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0430>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/runtime_envs> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf05e0>
2023-04-14 00:21:31,040 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/cluster_events> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0790>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/logs> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0940>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /api/v0/logs/{media_type}> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0af0>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/tasks/summarize> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0d30>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/actors/summarize> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf0ee0>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/objects/summarize> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf1090>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /api/v0/tasks/timeline> -> <function RateLimitedModule.enforce_max_concurrent_calls.<locals>.async_wrapper at 0x7fb5e5cf1240>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <DynamicResource /api/v0/delay/{delay_s}> -> <function StateHead.delayed_response at 0x7fb5e5cf1360>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /usage_stats_enabled> -> <function UsageStatsHead.get_usage_stats_enabled at 0x7fb5e5cf30a0>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <StaticResource /logs -> PosixPath('/tmp/ray/session_2023-04-14_00-21-28_973433_7/logs')> -> <bound method StaticResource._handle of <StaticResource /logs -> PosixPath('/tmp/ray/session_2023-04-14_00-21-28_973433_7/logs')>>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /> -> <function HttpServerDashboardHead.get_index at 0x7fb5e5cf37f0>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <PlainResource /favicon.ico> -> <function HttpServerDashboardHead.get_favicon at 0x7fb5e5cf3910>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:210 -- <ResourceRoute [GET] <StaticResource /static -> PosixPath('/usr/local/lib/python3.10/dist-packages/ray/dashboard/client/build/static')> -> <bound method StaticResource._handle of <StaticResource /static -> PosixPath('/usr/local/lib/python3.10/dist-packages/ray/dashboard/client/build/static')>>
2023-04-14 00:21:31,041 INFO (dashboard) http_server_head.py:211 -- Registered 51 routes.
2023-04-14 00:21:31,044 INFO (dashboard) event_utils.py:132 -- Monitor events logs modified after 1681429890.9484723 on /tmp/ray/session_2023-04-14_00-21-28_973433_7/logs/events, the source types are all.
2023-04-14 00:21:31,052 INFO (dashboard) usage_stats_head.py:168 -- Usage reporting is disabled.
2023-04-14 00:21:31,053 INFO (dashboard) actor_head.py:101 -- Getting all actor info from GCS.
2023-04-14 00:21:31,060 INFO (dashboard) actor_head.py:123 -- Received 0 actor info from GCS.
[2023-04-14 00:21:31,284 I 164 164] (raylet) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2023-04-14 00:21:31,284 I 164 164] (raylet) store_runner.cc:32: Allowing the Plasma store to use up to 15.0921GB of memory.
[2023-04-14 00:21:31,284 I 164 164] (raylet) store_runner.cc:48: Starting object store with directory /dev/shm, fallback /tmp/ray, and huge page support disabled
[2023-04-14 00:21:31,284 I 164 205] (raylet) dlmalloc.cc:154: create_and_mmap_buffer(15092154376, /dev/shm/plasmaXXXXXX)
[2023-04-14 00:21:31,285 I 164 205] (raylet) store.cc:554: ========== Plasma store: =================