Hi, I am using ray.put(large_2d_array) to store a large 2D boolean NumPy array. In the worker process I take a column from this shared array, pass it to a Cython function, and then create a flat buffer view via cdef cnp.npy_bool *view = &sliced_array[0]. With this view, I can modify the underlying buffer (setting some indices to True).
Since objects in the plasma store are immutable, what problems might I run into by modifying the buffer this way?
I would be glad if someone could shed some light on this.
Thank you!
P.S:
Code template:
import numpy as np
import ray

large_2d_array = np.zeros((6000000000, 205), dtype=bool)
shared_array = ray.put(large_2d_array)
# Call worker via ray remote here and pass the shared_array
......
......
@ray.remote(num_cpus=1)
def worker(large_2d_array, col_idx):
    array_slice = large_2d_array[:, col_idx]
    cython_function(array_slice)
file: cython_func.pyx
cimport numpy as cnp

def cython_function(cnp.ndarray[cnp.npy_bool, ndim=1] sliced_array):
    # Raw pointer to the first element of the slice
    cdef cnp.npy_bool *view = &sliced_array[0]
    cdef int i
    for i in range(100):
        view[i] = True
    return
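
To make the concern concrete, here is a minimal NumPy-only sketch of what I think the worker sees. It does not use Ray at all: the small array shape and the manually cleared writeable flag are my own stand-ins (assumptions) for the read-only array a worker gets from the object store. It shows that a column slice is a strided, read-only view, that a normal NumPy write is rejected, and that writing to a copy leaves the shared data untouched.

```python
import numpy as np

# Small stand-in for the shared array. Clearing the writeable flag by hand is
# an assumption meant to mimic the read-only array a worker receives.
shared = np.zeros((8, 5), dtype=bool)
shared.flags.writeable = False

# Taking a column gives a view onto the same memory, not a copy.
col = shared[:, 2]
print("writeable:", col.flags.writeable)
print("contiguous:", col.flags.c_contiguous, "strides:", col.strides)

# A normal NumPy write to the read-only view is rejected.
try:
    col[0] = True
except ValueError as exc:
    print("write rejected:", exc)

# Copying the column gives a writable, contiguous array private to this process.
private = col.copy()
private[:3] = True
print("shared modified:", bool(shared.any()))
```

Note that the column view is not contiguous here (its stride equals the row length in bytes), so a raw pointer taken from its first element would step through consecutive bytes of the first row rather than down the column.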