Hello there Ray community,

I just found out this nice framework and I would like to use it to parallelize some computation that by default runs on a single CPU in my python implementation.

I have followed a few examples available on the internet, and my code looks something like this:

```
import ray
from scipy.optimize import newton

@ray.remote
def newton_root(xi, eta, h, H):
    # scipy.optimize.newton: solve xinp1_eval(x, *args) = 0 starting from x0 = xi
    return newton(xinp1_eval, x0=xi, fprime=xinp1_grad_eval, args=(eta, xi, h, H))
```

This is the function that I am trying to parallelize. I am just trying to solve a root-finding problem for a set of points in a 2D numpy array. In this case, `xi` and `eta` are numpy arrays, and `H` is a `RectBivariateSpline` object from scipy.
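To make the setup concrete, here is a toy serial version of the kind of elementwise root finding I mean. The quadratic residual `f` and its gradient `fprime` below are made up for illustration; my real `xinp1_eval` / `xinp1_grad_eval` pair and the `H` spline are more involved:

```python
import numpy as np
from scipy.optimize import newton

# Hypothetical residual: find x such that x**2 - eta = 0 at each point.
# Stands in for my real xinp1_eval / xinp1_grad_eval functions.
def f(x, eta):
    return x**2 - eta

def fprime(x, eta):
    return 2.0 * x

eta = np.array([1.0, 4.0, 9.0])   # flattened 2D grid values
x0 = np.ones_like(eta)            # initial guesses, one per point

# scipy.optimize.newton accepts array inputs and iterates elementwise
roots = newton(f, x0=x0, fprime=fprime, args=(eta,))
print(roots)  # elementwise roots: [1. 2. 3.]
```

This serial version is what I am benchmarking the Ray version against.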

I then try to use the remote function by calling it through another function:

```
def root_pq_par(self):
    xi_g = ray.put(self.xi.flatten())
    eta_g = ray.put(self.eta.flatten())
    h_g = ray.put(self.h * np.ones(self.xi.flatten().shape))
    # H_g = ray.put(repeat(self.H, self.xi.flatten().shape[0]))
    H_g = ray.put(self.H)
    result = ray.get([newton_root.remote(xi_g, eta_g, h_g, H_g)])
    self.xi_np1 = np.copy(result)
```

So in this case, I first put the numpy arrays and the spline object into the shared-memory object store.

I then try to obtain the results at the end with `ray.get`.

The problem is that when I test this code, only a single CPU performs the computations: in the Windows Task Manager I can see that many Python interpreter processes are available, but all of them sit at 0% CPU usage, and only one shows a very low usage. In fact, if I compare the speed of this approach to that of a normal serial Python execution, I observe no difference at all.

What could I be missing here?

Thanks for the feedback!