GPU memory management

Hi guys,

How is GPU memory managed by Ray? Here is how I see the picture. An example:

import pandas
import ray

ray.init()

df = pandas.DataFrame()
o_ref = ray.put(df)

@ray.remote(num_gpus=1)
def foo(ref):
    # processing with ref
    pass

foo.remote(o_ref)

When ray.put(df) is called, the data is put into the Plasma object store. When foo is called, the data is transferred to GPU memory for processing.
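
For concreteness, here is a minimal sketch of what the task body might look like, assuming CuPy is the library used to move the data onto the GPU (the GPU library and the processing itself are just placeholders, not something Ray prescribes):

import cupy
import pandas
import ray

ray.init()

@ray.remote(num_gpus=1)
def foo(ref):
    # Ray resolves the ObjectRef argument from Plasma, so `ref` is already
    # the deserialized DataFrame inside the task. Copying it to GPU memory
    # is done explicitly by the task code, not by Ray (assumption: CuPy as
    # the GPU library).
    gpu_array = cupy.asarray(ref.to_numpy())
    return float(gpu_array.sum())  # some processing on the GPU

df = pandas.DataFrame({"x": [1.0, 2.0, 3.0]})
o_ref = ray.put(df)
print(ray.get(foo.remote(o_ref)))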

The question is: is the data left alive in GPU memory, or is it cleaned up right away?

It would be great if there were some docs describing GPU memory management in Ray.

Thanks in advance!

@sangcho , any thoughts?

Have you read this part of the doc? GPU Support — Ray v2.0.0.dev0

Is it helpful? I believe it is related.

@sangcho , thank you for pointing me to the docs! They are very helpful. As I understand it, either the task itself should manage/clean up GPU memory, or we can specify max_calls=1 so that the GPU resources are released right after the task finishes, right?
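
To make that concrete, a minimal sketch of the two options, assuming CuPy is the library holding GPU memory inside the task (the cleanup call would differ for other GPU libraries, and the function names are just illustrative):

import cupy
import ray

# Option 1: the task cleans up GPU memory itself before returning.
@ray.remote(num_gpus=1)
def cleans_up_itself(ref):
    gpu_array = cupy.asarray(ref.to_numpy())
    result = float(gpu_array.sum())
    del gpu_array
    # Release CuPy's cached device memory so the long-lived worker
    # process doesn't keep holding it between tasks.
    cupy.get_default_memory_pool().free_all_blocks()
    return result

# Option 2: max_calls=1 makes the worker process exit after each call,
# which releases everything it allocated on the GPU.
@ray.remote(num_gpus=1, max_calls=1)
def worker_exits_after_call(ref):
    gpu_array = cupy.asarray(ref.to_numpy())
    return float(gpu_array.sum())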

@YarShev Yes, that seems to be how it works now! IIUC, Ray itself doesn't have any special handling of GPU memory usage.
