NUMPY database?

PS>> I just started reading about HDF5, which seems to answer most of the questions …

How do I create a LARGE 2D numpy array that meets the following specs:

1. CAN do a DOT product with it (e.g. ary2d.dot(vec))
2. CAN use ary2d[rows,cols] syntax to update values
3. CAN resize the array
4. CAN access it by multiple Actors/tasks

My idea so far:

Have some sort of server/daemon app that forks multiple processes.
Split the array into chunks, so that resizing the array is simply adding a new chunk.
Applying a DOT product, for example, then becomes iteratively applying it to every chunk and combining the results.
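To make the chunking idea concrete, here is a minimal single-process sketch (pure NumPy; the class and method names are my own hypothetical placeholders, not any library's API). Chunks are row blocks, "resizing" appends a block, item assignment translates a global row index into (chunk, local row), and a dot product maps over the blocks and concatenates the results:

```python
import numpy as np

class ChunkedArray:
    """Hypothetical sketch: a 2D array stored as a list of row-chunks."""

    def __init__(self, ncols, chunk_rows=2):
        self.ncols = ncols
        self.chunk_rows = chunk_rows
        self.chunks = []  # list of (chunk_rows, ncols) blocks

    def add_chunk(self):
        # "Resizing" is just appending one new zeroed block of rows.
        self.chunks.append(np.zeros((self.chunk_rows, self.ncols)))

    def __setitem__(self, key, value):
        # Translate a global row index into (chunk index, local row).
        row, cols = key
        self.chunks[row // self.chunk_rows][row % self.chunk_rows, cols] = value

    def dot(self, vec):
        # Apply the dot product chunk-by-chunk and combine the pieces.
        return np.concatenate([c @ vec for c in self.chunks])

a = ChunkedArray(ncols=3, chunk_rows=2)
a.add_chunk()
a.add_chunk()                  # 4 rows total now
a[3, :] = [1.0, 2.0, 3.0]      # global row 3 lives in chunk 1, local row 1
print(a.dot(np.ones(3)))       # -> [0. 0. 0. 6.]
```

The missing piece, of course, is making this work across multiple processes instead of one.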

I couldn't find a way in the Ray Datasets or Apache Arrow docs to UPDATE the numpy array, e.g. chunk3[45,:] = vec

Does Ray handle locking of access to the chunks, or do I have to do it manually?
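To show the access pattern I mean: below is a stdlib-only sketch using threads rather than Ray actors (just for illustration; the function and variable names are mine). Several tasks write rows of one shared chunk, and a manual Lock is what makes each row update atomic. My question is whether Ray gives me this coordination for free or whether I'd carry the lock myself:

```python
import threading
import numpy as np

# One shared chunk, several writer tasks.
chunk = np.zeros((4, 3))
lock = threading.Lock()

def update_row(row, vec):
    with lock:               # manual locking: this is the part I'm asking
        chunk[row, :] = vec  # whether Ray would handle for me

threads = [
    threading.Thread(target=update_row, args=(r, [float(r)] * 3))
    for r in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(chunk[3])  # -> [3. 3. 3.]
```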

Should I use something like HDF5 instead of Arrow (which has its own Array type rather than np.array … and needs Cython)?

Sorry for the multi-directional questions; to put it succinctly, what I need is a NUMPY DATABASE.

All the projects I've checked so far (Dask, Vaex, PyTables, Arrow, and possibly Ray Datasets) seem to be NON-UPDATABLE, NON-RESIZABLE, SINGLE-CLIENT-ACCESS projects.

If you can comment on any of these topics with an example or a link to docs to read, that would help a lot.

I've read most of the Ray Core and Datasets docs and have done some non-trivial experiments, but the bottleneck is serial access to a numpy array (and a Python dict).
The multi-Actor app was ~3 times slower.

My hope is that chunking the array will allow multi-access and let me implement resizing (I use np.append() currently).
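For context on why np.append() is the pain point: it copies the entire array on every call, whereas keeping a list of chunks makes growth cheap and only pays for a copy when a full contiguous view is actually needed. A small comparison of the two patterns:

```python
import numpy as np

# np.append copies ALL existing rows on every call (O(n) per grow):
a = np.zeros((2, 3))
a = np.append(a, np.ones((1, 3)), axis=0)  # allocates a brand-new (3, 3) array

# Keeping a list of chunks defers that cost:
chunks = [np.zeros((2, 3))]
chunks.append(np.ones((1, 3)))   # O(1): old rows are never touched
combined = np.vstack(chunks)     # copy only when a full view is needed

print(a.shape, combined.shape)   # -> (3, 3) (3, 3)
```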

cc: @Clark_Zinzow @jianxiao