Hi Core team,
While answering a Stack Overflow question, I found that Ray shows unexplained memory usage when passing tasks a 10 MB payload (via the object store) together with a Python object instance that references another module. See the following measurements:
+--------------+-----------------+-----------+---------+------------------+
| with_payload | with_config_obj | num_tasks | used_mb | used_mb_per_task |
+--------------+-----------------+-----------+---------+------------------+
| True         | True            |         1 |   28.47 |            28.47 |
| True         | True            |         8 |  209.51 |            26.19 |
| True         | True            |        16 |  419.36 |            26.21 |
| False        | True            |         1 |   18.27 |            18.27 |
| False        | True            |         8 |  130.23 |            16.28 |
| False        | True            |        16 |  256.55 |            16.03 |
| True         | False           |         1 |    3.01 |             3.01 |
| True         | False           |         8 |   14.65 |             1.83 |
| True         | False           |        16 |   29.07 |             1.82 |
| False        | False           |         1 |    0.52 |             0.52 |
| False        | False           |         8 |    0.52 |             0.07 |
| False        | False           |        16 |    2.82 |             0.18 |
+--------------+-----------------+-----------+---------+------------------+
Why is used_mb_per_task ~10 MB higher when both with_payload and with_config_obj are True than when with_payload is False and with_config_obj is True?
Details
Memory measurement
I use the used field from psutil.virtual_memory(). The same pattern exists for available and free, so I don't think I'm double-counting the shared memory usage.
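Concretely, the measurement boils down to something like this (a simplified sketch with an illustrative helper, not the exact driver.py code):

import psutil

def measure_used_mb(run_tasks):
    # Sample used system memory before and after the tasks run, and
    # attribute the delta to them (assumes nothing else allocates
    # significantly in between).
    before = psutil.virtual_memory().used
    run_tasks()
    after = psutil.virtual_memory().used
    return (after - before) / (1 << 20)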
Config object/payload
The config object and payload are ray.put()'d after being defined as follows:
import nltk
import ray

class DummyObject:
    def do_something(self):
        # This reference is what ties the config object to the nltk module.
        print(nltk.__version__)

def create_data(target_size_mb):
    # Build a string of exactly target_size_mb megabytes.
    bytes_per_mb = 1 << 20
    payload = 'a' * target_size_mb * bytes_per_mb
    return payload

@ray.remote
def dummy_fun(config, payload):
    # The task only inspects the argument types; it does no work with the data.
    return type(config), type(payload)
The full code is here. You can generate the table above by running the driver.py script. I obtained those results with Ray 1.13 on a c5.4xlarge instance on AWS (16 vCPUs).
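For completeness, the per-configuration logic is roughly the following (a simplified sketch of the with_payload=True, with_config_obj=True case; the real driver.py also varies the flags and records the psutil measurements):

import ray

ray.init()

config_ref = ray.put(DummyObject())      # config object that references nltk
payload_ref = ray.put(create_data(10))   # 10 MB payload in the object store

num_tasks = 16
results = ray.get(
    [dummy_fun.remote(config_ref, payload_ref) for _ in range(num_tasks)]
)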