Streaming_split map_tasks stuck in pending node assignment forever

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am running a 4-node cluster with Ray 2.4. I run the script directly on a cluster node (e.g. after SSHing into the node using ray attach). When I call dataset.streaming_split, the whole iteration hangs, and I can see that the _map_task is always in the PENDING_NODE_ASSIGNMENT state. Here is the output of ray status:

======== Autoscaler status: 2023-10-23 20:39:10.214429 ========
Node status
Healthy:
 1 node_2bcffc4a1251dcee379c11bacebf3f302c84cfcda5bdb9f5687e0ea0
 1 node_f5bff15cbdcfe0206f6520d357c11bd55bad5593072be0ea79f2ac48
 1 node_56b769cf5a7517a4c39ee0a02a110ff0bedee2db4e71ae37258eabf2
 1 node_9c7759490693b4d6f90fb53404b9d863c79046a8640e95f5b66fa9b4
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
Usage:
 4.0/32.0 CPU
 0B/60.01GiB memory
 1.81GiB/16.00GiB object_store_memory
Demands:
 {'CPU': 1}: 39+ from request_resources()
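
To make the numbers in the status output above easier to read, here is a minimal sketch in plain Python (no Ray calls) that pulls the CPU line out of the Usage section. The helper name parse_cpu_usage and the STATUS_USAGE string are hypothetical, made up only for illustration. It shows that 28 of 32 CPUs are idle while the {'CPU': 1} demands are still outstanding, so the hang does not look like a simple CPU shortage, although this by itself does not identify the cause.

```python
import re

# Usage lines copied from the autoscaler status output above.
STATUS_USAGE = """\
4.0/32.0 CPU
0B/60.01GiB memory
1.81GiB/16.00GiB object_store_memory
"""


def parse_cpu_usage(usage_text):
    """Return (used, total) CPUs from the Usage lines (hypothetical helper)."""
    for line in usage_text.splitlines():
        # Match lines of the form "<used>/<total> CPU".
        match = re.match(r"\s*([\d.]+)/([\d.]+)\s+CPU\s*$", line)
        if match:
            return float(match.group(1)), float(match.group(2))
    raise ValueError("no CPU usage line found")


used, total = parse_cpu_usage(STATUS_USAGE)
idle = total - used
print(f"CPUs in use: {used}, total: {total}, idle: {idle}")
# prints: CPUs in use: 4.0, total: 32.0, idle: 28.0
```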