Ray read_iceberg doesn't scale to large Iceberg tables

How severely does this issue affect your experience of using Ray?
High

  • None: Just asking a question out of curiosity
  • Low: It annoys or frustrates me for a moment.
  • Medium: It makes completing my task significantly harder, but I can work around it.
  • High: It blocks me from completing my task.

We are trying to scale up our Ray data processing pipeline. However, the read_iceberg function (which interacts with our AWS Glue Data Catalog on S3) doesn’t seem to scale with the size of the table or the cluster.

When reading a small table with 100k rows or fewer, everything works as expected. However, when we point it at our production table, the process freezes completely, with no log output or progress updates.

Here is how we call read_iceberg:

import ray

# table_identifier and catalog_name are defined elsewhere in our pipeline.
ds = ray.data.read_iceberg(
    table_identifier=table_identifier,
    catalog_kwargs={"name": catalog_name, "type": "glue"},
)
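To narrow down where the freeze happens, something like the sketch below might help. It is only a rough diagnostic, not a fix. The helper names `timed_step` and `probe_iceberg_read` are made up for illustration. The sketch assumes PyIceberg is installed alongside Ray and that the catalog loads with the same "glue" settings as above. It times each stage separately (loading table metadata, planning the scan, creating the Ray dataset, pulling one row) and uses the standard-library `faulthandler` module to dump all thread stacks to stderr whenever a stage runs longer than a threshold, which should at least show where the process is stuck when nothing is logged.

```python
import faulthandler
import sys
import time
from contextlib import contextmanager


@contextmanager
def timed_step(label, dump_after_s=60.0):
    """Time a block of code (hypothetical helper).

    While the block runs, faulthandler dumps the stacks of all threads to
    stderr every `dump_after_s` seconds, so a silent hang leaves a trace.
    """
    faulthandler.dump_traceback_later(dump_after_s, repeat=True)
    start = time.perf_counter()
    try:
        yield
    finally:
        faulthandler.cancel_dump_traceback_later()
        elapsed = time.perf_counter() - start
        print(f"[{label}] {elapsed:.1f}s", file=sys.stderr)


def probe_iceberg_read(table_identifier, catalog_name):
    """Run each stage of the read separately (hypothetical helper)."""
    # Imported lazily so timed_step stays usable without Ray or PyIceberg.
    import ray
    from pyiceberg.catalog import load_catalog

    with timed_step("load table metadata"):
        catalog = load_catalog(catalog_name, type="glue")
        table = catalog.load_table(table_identifier)

    with timed_step("plan scan files"):
        # Assumption: scan planning walks the table's manifests, so on a
        # very large table this step alone may account for the stall.
        tasks = list(table.scan().plan_files())
    print(f"planned {len(tasks)} file scan tasks", file=sys.stderr)

    with timed_step("ray.data.read_iceberg"):
        ds = ray.data.read_iceberg(
            table_identifier=table_identifier,
            catalog_kwargs={"name": catalog_name, "type": "glue"},
        )

    with timed_step("take(1)"):
        # Depending on the Ray version, work may be deferred until the
        # dataset is consumed, so pull a single row to force execution.
        ds.take(1)

    return ds
```

Calling `probe_iceberg_read(table_identifier, catalog_name)` against the production table should show which stage never finishes. If the stall is already in "plan scan files", the bottleneck is likely on the Iceberg planning side rather than in Ray's execution.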

Any ideas or pointers to help debug the issue would be appreciated.