Disk space usage grows due to object spilling

  • Medium: It contributes significant difficulty to completing my task, but I can work around it.

I am using Ray to perform parallel operations, as shown in the code below. I found that it continuously spills objects to the local filesystem, quickly filling up the hard drive and affecting other processes as well.

Below is the code:

def run(self, filename: str, response: dict, filenames, shape:tuple):

    # pylint: disable=no-member
    array = ray.get([
        self._output.remote(self, response[i], filenames[i])
        for i in range(len(filenames))
    ])
    array = {k: v for d in array for k, v in d.items()}

    cv2.imwrite(result_path, array)

    # release array from ray's ObjectStore
    del array

@ray.remote
def _output(self, response, filenames):

    all_array = {}

    # Bytes to numpy.
    array = np.frombuffer(response[0], dtype=np.float32)
    array = np.reshape(array, response[1])

    for value, filename in zip(array, filenames):
        value = cv2.bitwise_not(value)
        value[value < 0.0] = 0.0
        value = self.image_add(value)
        value = cv2.bitwise_not(value)

        all_array[filename] = value
    return all_array 

I am using Ray 2.0.0.

All answers are appreciated. Thank you.

    array = ray.get([
        self._output.remote(self, response[i], filenames[i])
        for i in range(len(filenames))
    ])
    array = {k: v for d in array for k, v in d.items()}

Maybe instead of invoking every _output and appending the results to array, you can batch this? E.g., invoke N _output remote tasks, write the results with cv2.imwrite, and repeat. (In this case, you avoid having lots of objects in use at the same time, which reduces spilling.)

For example (note: the code is not exact, but it shows the idea):


def batch_write(filenames, responses):
    # Fetch only this batch's results, merge them, write, and drop the references.
    array = ray.get([
        self._output.remote(self, responses[i], filenames[i])  # pylint: disable=no-member
        for i in range(len(filenames))
    ])
    array = {k: v for d in array for k, v in d.items()}

    cv2.imwrite(result_path, array)

    # release array from ray's ObjectStore
    del array

batches = (len(filenames) + N - 1) // N  # round up so a final partial batch is included
for i in range(batches):
    batch_write(filenames[i * N : (i + 1) * N], response[i * N : (i + 1) * N])
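
Another way to get the same effect, if a fixed batch loop is awkward, is to cap the number of in-flight tasks with ray.wait, so only a bounded number of results ever sit in the object store at once. Again just a rough sketch: self, response, filenames, result_path, and _output are assumed to be the same objects as in your code above, and MAX_IN_FLIGHT is an arbitrary illustrative value.

import ray
import cv2

MAX_IN_FLIGHT = 8  # arbitrary cap on concurrently outstanding task results

def handle(results):
    # Merge and write a small group of results, mirroring the original run().
    array = {k: v for d in results for k, v in d.items()}
    cv2.imwrite(result_path, array)
    del array

in_flight = []
for i in range(len(filenames)):
    if len(in_flight) >= MAX_IN_FLIGHT:
        # Block until one task finishes and consume its result before submitting more.
        ready, in_flight = ray.wait(in_flight, num_returns=1)
        handle(ray.get(ready))
    in_flight.append(self._output.remote(self, response[i], filenames[i]))  # pylint: disable=no-member

# Drain whatever is still outstanding.
if in_flight:
    handle(ray.get(in_flight))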

Okay, thank you for the suggestion. I will try it.

(In this case, you avoid having lots of objects in use at the same time, which reduces spilling.)
→ Even if we reduce the spilling, it will still be a problem, since the spilled data continuously accumulates over time.

Thank you once again for the suggestion.

I made the code modifications (using batches) as suggested, but I am still seeing spilling.

Also attaching the ray memory output for reference.

You need to run the ray memory command while spilling is happening. I think you ran it after the job completed (so I cannot see the relevant information).
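
If it helps, here is a rough sketch for capturing those snapshots: run something like this from a separate terminal on the same machine while the job is still going. The 30-second interval and the output file name are arbitrary choices, not anything Ray requires.

import subprocess
import time

def capture_ray_memory(every_s=30.0, out_path="ray_memory_log.txt"):
    # Append a timestamped `ray memory` snapshot to a file every `every_s` seconds.
    with open(out_path, "a") as f:
        while True:
            snapshot = subprocess.run(["ray", "memory"], capture_output=True, text=True, check=False)
            f.write("--- " + time.strftime("%H:%M:%S") + " ---\n")
            f.write(snapshot.stdout)
            f.flush()
            time.sleep(every_s)

capture_ray_memory()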

If you reduce the batch size, does it work? E.g., if you set the batch size to 1, it should never spill. If it still does, maybe there's a mistake in the code.
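
To be concrete, with the batched loop from the earlier sketch (same batch_write, filenames, and response as above), the test would be roughly:

# With N = 1, only one task result is alive at a time, so no spilling is expected.
N = 1
for i in range((len(filenames) + N - 1) // N):
    batch_write(filenames[i * N : (i + 1) * N], response[i * N : (i + 1) * N])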