Hello,
I am currently trying to reduce the memory footprint of a program that takes one large numpy array and one large dictionary of pandas dataframes, and iterates over their values with different model parameters to produce a score/result. I would like to share these two objects across processes without copying. After making my model function a remote task, passing references to the large numpy/pandas objects, and running under the mprof memory profiler, I am finding that the memory overhead is the same as with ProcessPool, where these objects are pickled/copied into each process. What am I missing to get to zero-copy? (Possibly related thread: How to share memory with non-numpy object?)
The pattern looks like this:
```python
@ray.remote
def run_model(array, dict_of_pandas_df, params):
    # ... do some work ...
    return result

numpy_array_ref = ray.put(my_array)
dict_of_pandas_df_ref = ray.put(dict_of_pandas_df)
list_of_model_parameters = [{}, {}, ...]
result_refs = [
    run_model.remote(numpy_array_ref, dict_of_pandas_df_ref, x)
    for x in list_of_model_parameters
]
results = ray.get(result_refs)
```
When you do a zero-copy read, note that each process still reports the shared memory as its own, so the memory is double counted. For example, if you have 2 processes A and B that each map the same 100MB of shared memory, both of them will report 100MB of memory use.
cc @suquark do you know any good way to verify the zero copy read?
Thanks a lot @sangcho. If it’s useful, I am profiling with the following mprof command:

```
mprof run --include-children python my_script.py
```
Even when looking at the running processes in top/htop, I believe that the RES portion of memory is significantly larger than SHR, both when running with multiprocessing and with @ray.remote (will follow up to verify this).
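On Linux, a more precise way to check this than eyeballing top is to read `/proc/<pid>/status`, which splits RSS into anonymous, file-backed, and shmem-backed pages. For a genuine zero-copy read you would expect the large array to land in the shared components (`RssFile`/`RssShmem`, since Ray's object store is memory-mapped) rather than in `RssAnon`. A Linux-only sketch:

```python
def rss_breakdown(pid="self"):
    # Parse /proc/<pid>/status (Linux-only); values are in kB.
    # RssAnon          = private pages (real per-process copies),
    # RssFile/RssShmem = pages shared with other processes, e.g. a
    #                    memory-mapped object store.
    # A zero-copy read should grow the shared parts, not RssAnon.
    wanted = ("VmRSS", "RssAnon", "RssFile", "RssShmem")
    out = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in wanted:
                out[key] = int(rest.split()[0])
    return out

print(rss_breakdown())
```

Calling `rss_breakdown(pid)` on each worker's pid before and after the tasks touch the big array would show whether the growth is in private or shared memory.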
Hmm I see. What’s the dtype of your numpy array? Are they floats, integers, or strings?
I am passing in a numpy array with dtype float. The pandas dataframes – values of the dict – have mixed types: float/pandas.Timestamp.
Hmm, that’s pretty weird. A pandas dataframe uses numpy arrays under the hood afaik, and float-typed numpy arrays should be zero-copy readable. So each process should only copy the parts that are not numpy arrays, which means your SHR should be larger than RES!
Can you actually try something like this?

```python
import time
import ray

@ray.remote
def your_func(array_ref_list):
    # Measure the overhead of ray.get
    s = time.perf_counter()
    ray.get(array_ref_list)
    print(time.perf_counter() - s)

numpy_array_ref = ray.put(my_array)
# NOTE: Pass a list of object refs so the argument won't be automatically resolved.
refs = [your_func.remote([numpy_array_ref]) for _ in range(10)]
ray.get(refs)
```
If the overhead is as big as copying large array into the process, that means zero copy read wasn’t working as expected.
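A complementary check: afaik, zero-copy numpy reads out of Ray's immutable object store come back as read-only arrays, so `arr.flags.writeable` being `True` inside a task suggests you received a deserialized copy. The numpy-only sketch below (no Ray involved) shows the same read-only behavior for a no-copy view over an immutable buffer:

```python
import numpy as np

# np.frombuffer builds an array view over the bytes object without
# copying it, and because bytes are immutable, numpy marks the view
# read-only, analogous to a zero-copy array backed by a shared,
# immutable object store.
buf = np.arange(5, dtype=np.float64).tobytes()
view = np.frombuffer(buf, dtype=np.float64)
print(view.flags.writeable)  # False
```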
Actually, there’s also a possibility that pandas.Timestamp is not zero-copyable.
Awesome @sangcho. I’ll try what you’ve put forward – might take 24 hours or so to get back on this thread. Thanks!
Hey @sangcho. After looking into things, the latency of a ray.get inside a task is not very large – only a few seconds – for an object of ~50 GB. Need to do some more research, but hoping to post back on this thread with a fully reproducible example if this rears its head again – thank you so much for your help!
Sounds good! I still suspect the issue is your Timestamp dtype btw (afaik, we only support zero-copy reads for integer and float dtypes).
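If Timestamps do turn out to be the culprit, one possible workaround (a sketch, assuming your dates live in object-dtype columns) is to normalize them to `datetime64[ns]`. An object-dtype column holds one Python `Timestamp` per element and must be pickled element by element, while a `datetime64[ns]` column is backed by a flat int64 numpy buffer, the kind of block that should be zero-copy friendly:

```python
import pandas as pd

# An object-dtype column of Timestamp instances: each element is a
# separate Python object, so the column can't be shared as one buffer.
ts = pd.Series(
    [pd.Timestamp("2023-01-01"), pd.Timestamp("2023-01-02")], dtype=object
)
df = pd.DataFrame({"value": [1.0, 2.0], "when": ts})
print(df["when"].dtype)  # object

# Normalizing stores the column as int64 nanoseconds under the hood.
df["when"] = pd.to_datetime(df["when"])
print(df["when"].dtype)  # datetime64[ns]
```

Converting once before `ray.put` would let the datetime data travel as a numeric block instead of a pile of pickled objects.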