How can I see the detailed memory usage of an Actor?

1. Severity of the issue: (select one)
Medium: Significantly affects my productivity, but I can find a workaround.

2. Environment:

  • Ray version: 2.47.1
  • Python version: 3.11
  • OS: Ubuntu 22.04
  • Cloud/Infrastructure: local machine (RTX 5070 Ti, 64 GB RAM, AMD64)
  • Other libs/tools (if relevant): None

3. What happened vs. what you expected:

  • Expected: I’m new to Ray. I would like to know how to freeze a local Ray cluster before it kills all actors and tasks due to out-of-memory, and how to drill down into a specific actor for memory analysis and get detailed information about its objects in Plasma and on the heap.
  • Actual: The job exits immediately, leaving only the actor-level out-of-memory report shown below.

First of all, I would like to thank anyone who is willing to share their experience.

Traceback (most recent call last):
  File "/home/inbreeze/PycharmProjects/DataPlatformOnRay/python/raypipe/runner.py", line 65, in <module>
    logger.info(f"[DataPlatformOnRay] final return: {ray.get(mergeActor.run.remote())}")
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ray-dev/lib/python3.11/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ray-dev/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ray-dev/lib/python3.11/site-packages/ray/_private/worker.py", line 2849, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ray-dev/lib/python3.11/site-packages/ray/_private/worker.py", line 939, in get_objects
    raise value
ray.exceptions.OutOfMemoryError: Task was killed due to the node running low on memory.
Memory on the node (IP: 10.2.97.17, ID: 89f4f4b4c16742cd9053e68d1513923883441cf068bfc2667e81c669) where the task (actor ID: 0f28f36dc1a20a125cb8a86f05000000, name=ResultMergeActor.__init__, pid=268816, memory used=0.06GB) was running was 61.76GB / 62.62GB (0.98628), which exceeds the memory usage threshold of 0.95. Ray killed this worker (ID: 5d4a0cc4958f36a555306298c10f9043ec52bfd14067810dbc5f4a9d) because it was the most recently scheduled task; to see more information about memory usage on this node, use `ray logs raylet.out -ip 10.2.97.17`. To see the logs of the worker, use `ray logs worker-5d4a0cc4958f36a555306298c10f9043ec52bfd14067810dbc5f4a9d*out -ip 10.2.97.17. Top 10 memory users:
PID	MEM(GB)	COMMAND
263393	14.02	ray::SceneCutRouter.submit
268660	11.42	ray::AesActor.process_with_queue
268785	11.22	ray::OptFlowActor.process_with_queue
263388	6.15	ray::SceneCutMemActor
268686	4.95	ray::OCRActor
54419	3.07	/home/inbreeze/.cache/JetBrains/RemoteDev/dist/461d91da9a280_pycharm-2025.2.0.1/bin/remote-dev-serve...
263297	0.47	/opt/miniconda3/envs/ray-dev/bin/python /home/inbreeze/PycharmProjects/DataPlatformOnRay/python/rayp...
268778	0.41	ray::LaplacianActor.process_with_queue
263394	0.29	ray::SceneCutMemActor
70074	0.28	/home/inbreeze/.cache/JetBrains/PyCharm2025.2/full-line/models/7b957b12-0866-32e2-985d-1542c7c2aeee/...

@INBreeze I suggest using a memory profiling tool like memray for this. Relevant docs are here: Debugging Memory Issues — Ray 2.49.2

To memory profile Ray tasks or actors, use the memray library. Install memray (pip install memray), then wrap your Ray task or actor code with memray’s Tracker context manager, saving the output to /tmp/ray/session_latest/logs so you can download the profiling files from the Ray dashboard. After running, use memray flamegraph <memory profiling bin file> to visualize the results. Example code and detailed steps are provided in the official documentation.
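For reference, a minimal sketch of that pattern, based on the approach described in the docs above. The actor name (`SomeActor`), the placeholder allocation in `run`, and the profile file name are illustrative, not part of the original post:

```python
import memray
import ray

ray.init()

@ray.remote
class SomeActor:  # hypothetical actor, standing in for e.g. ResultMergeActor
    def __init__(self):
        # One profile file per actor; writing it under the Ray session log
        # directory makes it downloadable from the dashboard's log view.
        actor_id = ray.get_runtime_context().get_actor_id()
        self._profile_path = f"/tmp/ray/session_latest/logs/{actor_id}_mem_profile.bin"

    def run(self):
        # Wrap the code you want to profile in memray's Tracker context manager.
        with memray.Tracker(self._profile_path):
            data = [bytearray(1024 * 1024) for _ in range(100)]  # placeholder workload
            return len(data)

actor = SomeActor.remote()
print(ray.get(actor.run.remote()))
```

Once the job has run, `memray flamegraph /tmp/ray/session_latest/logs/<actor id>_mem_profile.bin` produces an HTML flame graph you can open in a browser to see which call sites are holding the memory. Note that memray must be installed in the worker environment, not just on the driver.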

Would you like a more detailed step-by-step walkthrough for profiling a specific Ray task or actor?

Hint: Mention @RunLLM in the post for followups.