Memory (RAM) not being released by Ray

  • High: It blocks me from completing my task.

In my project pipeline, I have two actors, CamActor (producer) and StreamActor (consumer). CamActor reads a camera stream and puts frames (Python class instances holding an RGB frame and meta information) into frameHolder (a Queue). StreamActor reads frames asynchronously from frameHolder and generates an encoded stream.
The camera streams at 20 fps and StreamActor can process 45 fps.

For every new job, the pipeline creates a new CamActor and frameHolder (queue); StreamActor is shared across all jobs.
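
Roughly, the pipeline looks like this (a simplified sketch, not the actual project code; method names such as _read_frame/_encode and the concurrency settings are placeholders):

```python
import ray
from ray.util.queue import Queue, Empty

@ray.remote(max_concurrency=2)  # threaded actor so stop_job() can run while start_job() loops
class CamActor:
    def __init__(self, frame_holder: Queue):
        self.frame_holder = frame_holder
        self.alive = True

    def start_job(self):
        while self.alive:                    # camera delivers ~20 fps
            frame = self._read_frame()       # placeholder: returns a frame object (RGB + meta)
            self.frame_holder.put(frame)

    def stop_job(self):
        self.alive = False

    def _read_frame(self):
        ...                                  # read one frame from the camera stream


@ray.remote(max_concurrency=8)  # single shared consumer serving all jobs
class StreamActor:
    def __init__(self):
        self.alive = {}                      # job_id -> running flag

    def start_job(self, job_id, frame_holder: Queue):
        self.alive[job_id] = True
        while self.alive[job_id]:            # total throughput tops out around 45 fps
            try:
                frame = frame_holder.get(timeout=1)
            except Empty:
                continue
            self._encode(frame)              # placeholder: append to the encoded stream

    def stop_job(self, job_id):
        self.alive[job_id] = False

    def _encode(self, frame):
        ...
```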

When I start two parallel jobs, it works perfectly, as StreamActor can handle 40 fps (two camera frameHolders are being fed at 20 fps each). When I start a third job, frames start to pile up in the queues, as StreamActor can only handle 45 fps while the input is 60 fps (three camera frameHolders fed at 20 fps each). So every second 15 frames are left behind in the queues, which causes memory to grow.

Now, the main issue is as follows:
If I start two jobs, some memory is occupied, and when I stop them after some time, the memory is released.
But when I start three jobs, some memory is occupied at the start and memory consumption increases gradually (due to the extra frames in the frameHolders). Now, if I stop any/all of the jobs, Ray does not release the memory occupied by these extra frames.

Things tested:

  • Tried to clear the queue (read all remaining data and delete it) before shutdown.
  • Added gc.collect().

Note: I can’t use ray.shutdown() at the end of a job here, because parallel jobs sharing the same StreamActor are still running.

Hi @shyampatel,

How is your Queue implemented, is it an actor? How do you stop your jobs? Could you try ray memory and see what it tells you?

Thanks for the quick response, @jjyao.

I am using Ray’s built-in queue: from ray.util.queue import Queue.

My stop logic is as follows:

  • Each job is assigned a unique job_id.
  • All the actors are detached.
  • Each actor has an alive flag, based on which it exits its continuous while loop. Each actor has start_job and stop_job functionality. stop_job disables the flag, removes the remaining data from the queue and shuts the queue down.
  • The CamActor is named based on the job_id.
  • StreamActor is defined with a fixed name. The same actor is used for every job; we call its asynchronous start_job method and pass frame_holder as an argument.
  • On a stop call, we pass the job_id we want to stop. It gets both actors and calls stop_job (a rough sketch follows below).
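
In code, the stop path looks roughly like this (a simplified sketch; the actor name patterns and the exact ordering are assumptions, not the real implementation):

```python
import ray
from ray.util.queue import Empty

def stop_pipeline_job(job_id, frame_holder):
    # Look up the detached, named actors for this job.
    cam_actor = ray.get_actor(f"cam_{job_id}")      # name pattern is an assumption
    stream_actor = ray.get_actor("stream_actor")    # fixed name, shared by all jobs

    # Disable the alive flags so both while-loops exit.
    ray.get(cam_actor.stop_job.remote())
    ray.get(stream_actor.stop_job.remote(job_id))

    # Drain any leftover frames, then shut the queue actor down.
    # (In practice you may want to wait for StreamActor.start_job to return
    # before shutting the queue down.)
    while True:
        try:
            frame_holder.get_nowait()
        except Empty:
            break
    frame_holder.shutdown()
```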

During the jobs, ray memory returned the following:

192.168.10.101    16177  Worker  disabled                -               519485.0 B  PINNED_IN_MEMORY    9c40ab3d946b66c56dd9193d64d6aeab067f04250300000001000000
192.168.10.101    17942  Worker  disabled                -               519686.0 B  PINNED_IN_MEMORY    17c06bbeb8257d3a1bcff6cb31db22b0ff54384d0400000001000000
192.168.10.101    16187  Worker  disabled                -               519689.0 B  PINNED_IN_MEMORY    b5211e7a44521434a5d83baccfaa3db42d9c4fa60200000001000000
192.168.10.101    17810  Worker  disabled                -               520267.0 B  PINNED_IN_MEMORY    3bf4fc1c506592416740a95795520bd9356963ec0400000001000000
192.168.10.101    17026  Worker  disabled                -               520270.0 B  PINNED_IN_MEMORY    e0ec20487db2284444c9fb3df8db6f7c15395f200300000001000000
                                        .................... and a few more lines like the above
--- Aggregate object store stats across all nodes ---
Plasma memory usage 1833 MiB, 3704 objects, 40.26% full, 0.04% needed
Objects consumed by Ray tasks: 40829 MiB.

After ending all jobs, ray memory returned the following:

======== Object references status: 2022-08-17 09:45:42.952123 ========
Grouping by node address...        Sorting by object size...        Display all entries per group...


To record callsite information for each ObjectRef created, set env variable RAY_record_ref_creation_sites=1

--- Aggregate object store stats across all nodes ---
Plasma memory usage 0 MiB, 6 objects, 0.0% full, 0.0% needed
Spilled 15 MiB, 31 objects, avg write throughput 11 MiB/s
Objects consumed by Ray tasks: 56454 MiB.

I am using custom resources for each actor. After stopping any job, the actors are killed as expected and release their CPU and custom resources. In the dashboard too, after stopping a job, the actors for that job show as DEAD.
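
For reference, the per-job actor is created along these lines (a sketch; the custom resource name, resource amount, and actor name pattern are placeholders, and the custom resource itself has to be declared on the node, e.g. in cluster.yaml):

```python
# Hypothetical creation of the per-job producer with a custom resource,
# a detached lifetime, and a job_id-based name.
cam_actor = CamActor.options(
    name=f"cam_{job_id}",          # name pattern is an assumption
    lifetime="detached",
    num_cpus=1,
    resources={"camera_slot": 1},  # custom resource name is an assumption
).remote(frame_holder)
```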

When you stop all jobs, which process is holding the memory that you think should be released? Is it object store memory or heap memory? Could you check Queue.empty() to make sure it’s actually empty?

Also, for the detached actors, do you manually kill them, since they won’t be automatically GCed (Terminating Actors — Ray 1.13.0)?

When I stop a job, each actor is killed successfully. After stopping all jobs, when I check the dashboard, object store memory and heap memory come back to their starting point.
Before job start: [dashboard screenshot]
After stopping all jobs: [dashboard screenshot]

But when I check RAM usage using htop, I find that some memory is still occupied.
Before job start: [htop screenshot]
After stopping all jobs: [htop screenshot]

Yes, using Queue.empty() I have checked at the end that the queue is empty.

For detached actors, I am manually killing them in stop_job.
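
Roughly like this inside stop_job (a sketch; the actor name pattern is an assumption):

```python
import ray

# Detached actors are not garbage-collected automatically,
# so the per-job CamActor is removed explicitly once its loop has stopped.
cam_actor = ray.get_actor(f"cam_{job_id}")  # hypothetical name pattern
ray.kill(cam_actor)
```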

It seems that after all jobs are stopped, the Ray dashboard and htop have different views of how much memory is still occupied. Could you check what another command, like free, says?

Before job start:
ray dashboard: [screenshot]
htop: [screenshot]
free: [screenshot]

After stopping all jobs:
ray dashboard: [screenshot]
htop: [screenshot]
free: [screenshot]

Note:
We are running the cluster using a cluster.yaml file. When we bring the cluster down with the ray down cluster.yaml command, this extra occupied memory is released.

Hi @shyampatel,

Thanks for sharing those screenshots. It seems there is around 1.8 GB that’s not freed. As a next step, could you share the memory usage of the Ray processes (e.g. raylet, the dashboard process) before and after? I’m trying to figure out which process is not releasing the memory. Also, could you run df -BK | grep tmpfs before and after?

My current thinking is that when you stop all jobs, the Ray system processes are still running. Specifically, raylet is running and it still holds the object store memory (even though there are no objects, since we preallocate it).

Thanks for your continued support and suggestions, @jjyao.

I have attached all the required information to debug the issue:

Before job start:
Ray Dashboard: [screenshot]
htop with Ray processes: [screenshot]
free & df -BK | grep tmpfs: [screenshot]
ray memory: [screenshot]

After job start:
Ray Dashboard: [screenshot]
htop with Ray processes: [screenshot]
free & df -BK | grep tmpfs: [screenshot]
ray memory: [screenshot]

Thanks.

Could you check /tmp/ray/session_latest/logs/raylet.out? In the first few lines, you should see something like [2022-08-21 22:35:16,448 I 52466 1147664] (raylet) store_runner.cc:48: Starting object store with directory /tmp, fallback /tmp/ray, and huge page support disabled. Basically I want to see the directory of the object store. Can you show the before and after disk usage of that directory?

Here are the first few lines of /tmp/ray/session_latest/logs/raylet.out:

[2022-08-23 09:55:16,261 I 238744 238744] (raylet) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2022-08-23 09:55:16,262 I 238744 238744] (raylet) store_runner.cc:32: Allowing the Plasma store to use up to 4.60466GB of memory.
[2022-08-23 09:55:16,262 I 238744 238744] (raylet) store_runner.cc:48: Starting object store with directory /dev/shm, fallback /tmp/ray, and huge page support disabled
[2022-08-23 09:55:16,262 I 238744 238779] (raylet) dlmalloc.cc:154: create_and_mmap_buffer(4604690440, /dev/shm/plasmaXXXXXX)
[2022-08-23 09:55:16,262 I 238744 238779] (raylet) store.cc:546: ========== Plasma store: =================
Current usage: 0 / 4.60466 GB

Here is the requested debug information:
Before job start: [screenshot]
After job start: [screenshot]

Note:

  • I analyzed the disk usage of the object store directory (/dev/shm) during the job, and it was 0 (zero) for the entire job.

@jjyao
Can you please analyze the above information and give some insight into the object store memory? How can I be sure that it has been cleaned up at the end of a job? How can I clean it up manually?

Sorry for the late reply. It turns out that du -hd 0 /dev/shm doesn’t show the correct usage of object store memory. Could you run df -h | grep tmpfs before and after?

Before job start: [df -h | grep tmpfs screenshot]
After job start: [df -h | grep tmpfs screenshot]

Note:

  • Here the /dev/shm storage usage is increasing.

Yeah, that makes sense. The object store preallocates/reserves a fixed amount of memory for the lifetime of the cluster, and it won’t be freed, even when there are no objects, until the cluster is shut down. By default, we don’t pre-populate the object store memory, which means the physical memory is only allocated by the OS as objects are put into the object store, but once it has been allocated, it stays allocated.

In summary, I think the memory that is not being released is occupied by the object store, and this is expected, since the object store keeps that memory around for future objects.

Thanks for the insights, @jjyao.

Is there any way we can reset the object store memory without shutting Ray down?

Actually, there are multiple pipelines running on the system (some with Ray and some without). So we can’t allow a single pipeline to occupy extra space when no job is running.

Hi @shyampatel,

Currently there is no way to reset the object store memory without a Ray shutdown, although you can specify how much memory the object store should reserve via the --object-store-memory option.
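
For example (the value below is only an illustration; when the cluster is launched from cluster.yaml, the flag goes into the ray start commands there rather than into ray.init):

```python
import ray

# Reserve a smaller object store on this node (value in bytes; illustrative only).
# The CLI equivalent is: ray start --object-store-memory=1000000000
ray.init(object_store_memory=1_000_000_000)
```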