[Core] How to resolve RayOutOfMemoryError in Python with the ray package?

Dravid · April 24, 2021, 4:58am

I am getting the error below when a heavy operation runs in the application.

Error:

ray.memory_monitor.RayOutOfMemoryError

Memory usage is at 3.81/4.0 GB. Can anyone help me resolve this issue?

Do I need to increase the memory? If so, how do I do that?

Below is how Ray is initialized in the Python application:

import ray
ray.init(ignore_reinit_error=True)
ray.cluster_resources()["CPU"]

Can anyone guide me on how to resolve the RayOutOfMemoryError?

StackOverflow link

sangcho · April 24, 2021, 5:38am

Hi @Dravid, thanks for asking the question! First, I’d recommend taking a look at the memory management section of the Ray documentation (Memory Management — Ray v2.0.0.dev0).

The error is raised when more than 95% of your machine's memory is in use. Running with memory usage that high is generally not recommended because it can cause many unexpected bugs.
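
As a rough illustration of that threshold, here is a sketch of the arithmetic only. It is not Ray's actual memory monitor; the function name and the 0.95 default are assumptions taken from the 95% figure above. The 3.81/4.0 GB reported in the question works out to just over the limit:

```python
def exceeds_memory_threshold(used_gb: float, total_gb: float, threshold: float = 0.95) -> bool:
    """Return True when used/total is above the threshold fraction (hypothetical helper)."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return used_gb / total_gb > threshold


# Figures from the question: 3.81 GB used out of 4.0 GB.
print(f"{3.81 / 4.0:.2%}")                  # 95.25%
print(exceeds_memory_threshold(3.81, 4.0))  # True
```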

There are usually two possible reasons for high memory usage.

1. You are using a lot of Ray's object store memory. You can check this from the Ray dashboard (go to localhost:8265). In this case, some of your objects may not be GC’ed from the object store because you still hold references to them. You can use the ray memory command to debug this scenario.
2. Your Ray application itself simply uses a lot of memory. To figure this out, look at the processes named ray:: in htop and observe the memory usage of each one.
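
The second check can also be scripted instead of reading htop by eye. This is a minimal sketch, assuming a `ps` that accepts `-A -o rss= -o args=` and reports RSS in KiB (as on Linux), and assuming worker command lines start with `ray::` as described above. The helper names are hypothetical and not part of Ray.

```python
import subprocess
from typing import Dict


def parse_ray_worker_memory(ps_output: str) -> Dict[str, float]:
    """Map each ray:: process title to its total resident memory in MiB.

    Expects lines of the form "<rss_kib> <command line>".
    """
    totals: Dict[str, float] = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        rss_kib, command = parts
        if command.startswith("ray::"):
            name = command.split()[0]
            totals[name] = totals.get(name, 0.0) + int(rss_kib) / 1024
    return totals


def ray_worker_memory() -> Dict[str, float]:
    """Run ps and summarize memory of ray:: processes (assumes Linux-style ps)."""
    out = subprocess.run(
        ["ps", "-A", "-o", "rss=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ray_worker_memory(out)


# Synthetic ps output, so the parsing can be tried without a running cluster.
sample = "204800 ray::IDLE\n1048576 ray::my_task\n  5120 python app.py\n"
print(parse_ray_worker_memory(sample))  # {'ray::IDLE': 200.0, 'ray::my_task': 1024.0}
```

Calling `ray_worker_memory()` on the machine running Ray would give the same kind of per-process summary for the live workers.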

Dravid · April 24, 2021, 5:43am

@sangcho
Thank you for your quick reply.
We are using a Flask REST API and modin.pandas to serve a large amount of data. In this case, can we use the ray.shutdown() function to reset Ray's memory before sending the response?

sangcho · April 26, 2021, 4:50pm

Hmm, that could work, but calling ray.shutdown() for every request is not a common pattern and is not recommended (the standard practice is to call ray.init(address='auto') for the Flask process).

Dravid · April 27, 2021, 3:44am

@sangcho Could you please explain how ray.init(address='auto') will help in this case?

sangcho · April 29, 2021, 6:14am

The most common pattern is to run ray.init(address='auto') when your process (the Flask server) starts. I am not offering this as a solution to your question; I am saying that the approach you suggested is not a common pattern. Running ray.init(address='auto') creates a new job in the cluster, and ray.shutdown() kills that job. Creating and killing a job for every request could therefore cause unexpected problems, especially because it is an uncommon pattern.
