Adjusting the number of pending tasks per actor?

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.

2. Environment:

  • Ray version: 2.47.1
  • Python version: v3.12.11
  • OS: Linux
  • Cloud/Infrastructure: 8x A100 40GB Node
  • Other libs/tools (if relevant):

3. What happened vs. what you expected:

  • Expected: Reduce the number of tasks assigned to each actor (4 → 2)
  • Actual: 4 tasks are always assigned to each actor

I ran map_batches (DoclingConverter) with 126 actors.

Running Dataset: dataset_3_0. Active & requested resources: 0/96 CPU, 7/7 GPU, 1.9MB/93.1GB object store:   3%|███▋                                                                                                                            | 111/3.83k [01:59<20:34, 3.01 row/s]
- StreamingRepartition: Tasks: 0; Actors: 0; Queued blocks: 0; Resources: 0.0 CPU, 1.9MB object store: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.83k/3.83k [01:57<00:00, 372 row/s]
- MapBatches(DoclingConverter): Tasks: 504; Actors: 126; Queued blocks: 3211; Resources: 0.0 CPU, 7.0 GPU, 66.1KB object store; [0/615 objects local]:   3%|██▍                                                                                | 112/3.83k [01:57<19:25, 3.19 row/s]

Each actor only handles 1 task at a time, but 4 tasks are always assigned to each actor (504 tasks across 126 actors in the output above), leaving the remaining 3 tasks in “Waiting for scheduling”.

Is there a way to adjust this factor? I would like to lower it from 4 to 2 so that more tasks (blocks) stay queued.

I want to do this because my (bad) implementation of the task takes anywhere from 2 minutes to 1 hour, depending on the row data, so I want to distribute tasks across the actors as evenly as possible.
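To illustrate the concern, here is a toy simulation in plain Python (not Ray code). It is only a sketch under assumptions: `simulate_makespan` and `max_in_flight` are hypothetical names, and the dispatch policy (hand the next queued task to the actor with the fewest assigned tasks, each actor running its tasks one at a time in FIFO order) is a guess at the behavior, not Ray's actual scheduler. It shows how short tasks that are pre-assigned behind one long task delay the overall finish time.

```python
import heapq
from collections import deque


def simulate_makespan(durations, num_actors, max_in_flight):
    """Return the time at which the last task finishes (toy model, not Ray)."""
    pending = deque(durations)      # tasks not yet assigned to any actor
    in_flight = [0] * num_actors    # tasks assigned to each actor (running + waiting)
    free_at = [0] * num_actors      # time at which each actor drains its assigned tasks
    events = []                     # min-heap of (finish_time, actor_index)

    def dispatch(now):
        # Assign queued tasks to the least-loaded actor until every actor is at the cap.
        while pending:
            actor = min(range(num_actors), key=lambda a: in_flight[a])
            if in_flight[actor] >= max_in_flight:
                return
            start = max(now, free_at[actor])  # waits behind tasks already assigned
            free_at[actor] = start + pending.popleft()
            in_flight[actor] += 1
            heapq.heappush(events, (free_at[actor], actor))

    dispatch(0)
    makespan = 0
    while events:
        now, actor = heapq.heappop(events)  # a task finished on this actor
        in_flight[actor] -= 1
        makespan = now
        dispatch(now)
    return makespan


# One 60-minute task followed by seven 2-minute tasks, on 2 actors.
durations = [60, 2, 2, 2, 2, 2, 2, 2]
print(simulate_makespan(durations, num_actors=2, max_in_flight=4))  # 66
print(simulate_makespan(durations, num_actors=2, max_in_flight=2))  # 62
print(simulate_makespan(durations, num_actors=2, max_in_flight=1))  # 60
```

In this toy model, a cap of 4 leaves three short tasks stuck behind the 60-minute task while the other actor sits idle, and a lower cap keeps more tasks in the shared queue for whichever actor frees up first.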

Thank you!