Actor task fails running under Serve: is it normal to have this depth?

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.

2. Environment:

  • Ray version: 2.51.1
  • Python version: 3.10
  • OS: python:3.10-slim Docker image running on Ubuntu 24.04
  • Cloud/Infrastructure: GCP
  • Other libs/tools (if relevant): -

3. What happened vs. what you expected:

When a task fails in our system, I see an unusually high depth value in the log message:

```
task_manager.cc:1321: Task failed: Type=ACTOR_TASK, Language=PYTHON, Resources: {}, function_descriptor={...}, task_id=..., task_name=..., job_id=01000000, num_args=6, num_returns=1, max_retries=0, depth=660407, attempt_number=0, actor_task_spec={actor_id=..., actor_caller_id=ffffffffffffffffbb5128cfbcad1549ce6532fa01000000, seq_no=5828, retry_exceptions=0}
```

As far as I can tell, nothing in our codebase would explain a call chain this long. Is there anything in Ray Serve that would explain this?

It seems likely that the `depth` field in Ray task logs represents the call-chain depth: it starts at 0 for the driver and is incremented each time a task or actor submits another task or actor. A very high depth (e.g., 660,407) is unusual and typically indicates deeply nested or recursive task/actor submissions; Ray Serve itself does not inherently create such deep chains unless your application logic does so (e.g., deployments calling each other in a loop or recursively) [common.proto, see `int64 depth = 27;`][task_spec.cc, see `stream << ", depth=" << GetDepth()`].
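
For illustration, here is a minimal sketch of how recursive submissions would grow that counter (the function name `chain` and the counts are made up, not anything from your app): each `.remote()` call made from inside a running task or actor is recorded one level deeper than its caller.

```python
import ray

ray.init()

@ray.remote
def chain(n):
    # Each recursive .remote() submission runs one level deeper than its
    # caller, so the internal depth counter grows by 1 per hop. A chain of
    # ~660,000 hops would produce depth values like the one in your log.
    if n == 0:
        return 0
    return ray.get(chain.remote(n - 1)) + 1

# The driver runs at depth 0; chain(5) submits 6 nested tasks (depths 1..6).
print(ray.get(chain.remote(5)))
```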

My understanding is that unless your code, or a library you use, is recursively submitting tasks or actors, this depth should not grow so large. There is no documented Ray Serve mechanism that would automatically cause such high depth values. Would you like a step-by-step breakdown of how depth is tracked, or suggestions for debugging where this might be happening in your code?
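
If you want to locate the runaway submission site, one hedged approach is to thread an explicit hop counter through your own calls and log whenever it exceeds what you expect. Note that, to my knowledge, Ray does not expose its internal depth counter through a public API, so the `hop` parameter, the `Worker` actor, and the threshold below are made-up application-level bookkeeping, not Ray features.

```python
import logging

import ray

logging.basicConfig(level=logging.WARNING)
MAX_EXPECTED_HOPS = 50  # hypothetical threshold; tune it to your application


@ray.remote
class Worker:
    def handle(self, payload, hop=0):
        # Log the task that first exceeds the expected nesting so you can
        # trace back to whichever caller keeps re-submitting work.
        if hop > MAX_EXPECTED_HOPS:
            logging.warning(
                "deep call chain (hop=%d) in task %s",
                hop,
                ray.get_runtime_context().get_task_id(),
            )
        # ... real work goes here; pass hop + 1 to any downstream .remote() calls
        return payload


if __name__ == "__main__":
    ray.init()
    w = Worker.remote()
    print(ray.get(w.handle.remote("payload", hop=0)))
```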

Hint: Mention @RunLLM in the post for followups.