I just found where the logs are located and confirmed that the GCS server did start on port 8084 as requested:
bash-4.2$ cat /tmp/ray/session_latest/logs/gcs_server.out
[2022-02-06 06:07:22,358 I 293 293] io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2022-02-06 06:07:22,358 I 293 293] gcs_redis_failure_detector.cc:30: Starting redis failure detector.
[2022-02-06 06:07:22,358 I 293 293] gcs_init_data.cc:42: Loading job table data.
[2022-02-06 06:07:22,358 I 293 293] gcs_init_data.cc:54: Loading node table data.
[2022-02-06 06:07:22,358 I 293 293] gcs_init_data.cc:66: Loading cluster resources table data.
[2022-02-06 06:07:22,358 I 293 293] gcs_init_data.cc:93: Loading actor table data.
[2022-02-06 06:07:22,358 I 293 293] gcs_init_data.cc:79: Loading placement group table data.
[2022-02-06 06:07:22,359 I 293 293] gcs_init_data.cc:46: Finished loading job table data, size = 0
[2022-02-06 06:07:22,359 I 293 293] gcs_init_data.cc:58: Finished loading node table data, size = 0
[2022-02-06 06:07:22,359 I 293 293] gcs_init_data.cc:70: Finished loading cluster resources table data, size = 0
[2022-02-06 06:07:22,359 I 293 293] gcs_init_data.cc:97: Finished loading actor table data, size = 0
[2022-02-06 06:07:22,359 I 293 293] gcs_init_data.cc:84: Finished loading placement group table data, size = 0
[2022-02-06 06:07:22,359 I 293 293] gcs_heartbeat_manager.cc:30: GcsHeartbeatManager start, num_heartbeats_timeout=30
[2022-02-06 06:07:22,374 I 293 293] grpc_server.cc:112: GcsServer server started, listening on port 8084.
[2022-02-06 06:07:22,384 I 293 293] gcs_server.cc:339: Gcs server address = 100.96.132.173:8084
[2022-02-06 06:07:22,384 I 293 293] gcs_server.cc:343: Finished setting gcs server address: 100.96.132.173:8084
[2022-02-06 06:07:22,384 I 293 293] gcs_server.cc:531: GcsNodeManager: {RegisterNode request count: 0, DrainNode request count: 0, GetAllNodeInfo request count: 0, GetInternalConfig request count: 0}
GcsActorManager: {RegisterActor request count: 0, CreateActor request count: 0, GetActorInfo request count: 0, GetNamedActorInfo request count: 0, GetAllActorInfo request count: 0, KillActor request count: 0, ListNamedActors request count: 0, Registered actors count: 0, Destroyed actors count: 0, Named actors count: 0, Unresolved actors count: 0, Pending actors count: 0, Created actors count: 0}
GcsPlacementGroupManager: {CreatePlacementGroup request count: 0, RemovePlacementGroup request count: 0, GetPlacementGrouprequest count: 0, GetAllPlacementGroup request count: 0, WaitPlacementGroupUntilReady request count: 0, GetNamedPlacementGroup request count: 0, Scheduling pending placement group count: 0, Registered placement groups count: 0, Named placement group count: 0, Pending placement groups count: 0}
GcsPubSub:
- num channels subscribed to: 0
- total commands queued: 0
DefaultTaskInfoHandler: {AddTask request count: 0, GetTask request count: 0, AddTaskLease request count: 0, GetTaskLease request count: 0, AttemptTaskReconstruction request count: 0}
GrpcBasedResourceBroadcaster: {Tracked nodes: 0}
[2022-02-06 06:07:23,398 I 293 293] gcs_node_manager.cc:42: Registering node info, node id = d0ea1c5387f81ad72023fe9b12b24a9734b3ec95bb33220c5bac20d4, address = 100.96.132.173
[2022-02-06 06:07:23,398 I 293 293] gcs_node_manager.cc:47: Finished registering node info, node id = d0ea1c5387f81ad72023fe9b12b24a9734b3ec95bb33220c5bac20d4, address = 100.96.132.173
[2022-02-06 06:07:23,398 I 293 293] gcs_placement_group_manager.cc:722: A new node: d0ea1c5387f81ad72023fe9b12b24a9734b3ec95bb33220c5bac20d4 registered, will try to reschedule all the infeasible placement groups.
[2022-02-06 06:07:23,399 I 293 293] gcs_job_manager.cc:140: Getting all job info.
[2022-02-06 06:07:23,400 I 293 293] gcs_job_manager.cc:146: Finished getting all job info.
Similarly, the NodeManager and the Redis servers also started. So I'm not sure why I still see errors like
2022-02-05 16:39:16,665 ERROR gcs_utils.py:137 -- Failed to send request to gcs, reconnecting. Error <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1644079156.664823970","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3134,"referenced_errors":[{"created":"@1644079156.664822821","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
on the worker, even though I'm able to talk to a simple HTTP server on port 8084 between the same pair of hosts.
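
To narrow this down, one thing I can try is a plain TCP probe from the worker against the exact address the GCS log reports (100.96.132.173:8084) while Ray itself is running, rather than against a stand-in HTTP server. This is only a minimal sketch: `can_connect` is a helper name I made up, it uses nothing but the Python standard library, and it checks TCP reachability only, so a successful connect would not prove the gRPC layer is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure, and timeout.
        return False


if __name__ == "__main__":
    # Address copied from the gcs_server.out log above; change it if the
    # worker is actually dialing a different address.
    gcs_host, gcs_port = "100.96.132.173", 8084
    print(f"{gcs_host}:{gcs_port} reachable:", can_connect(gcs_host, gcs_port))
```

If this prints `False` from the worker while the GCS server is up, the problem is likely at the network level (routing, firewall, or the worker using a different address). If it prints `True`, the address the worker is actually using for its gRPC channel is the next thing to check.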