Nested remote functions force calling ray.get twice

Hi everyone!
I have run into a problem that I think may be easy to solve.
I am parallelising my code with the @ray.remote decorator for functions.
I have 4 decorated functions, and one of them (start_to_end) calls the other three, one after another, via .remote().
Finally, I want to obtain the parallelised results with this code:
start = time.time()
arrays = ray.get([start_to_end.remote(l[0], l[1], 1000, 1) for l in paths_])
end = time.time()
print(end - start)
print(arrays)
I expect the print to show numpy arrays, but I get object refs instead. This is contrary to the behaviour described in the tutorials:
[ObjectRef(5509701662a88bedffffffff0100000001000000),
ObjectRef(b8fede355f38ebfdffffffff0100000001000000),
ObjectRef(c69a52ad8b98212fffffffff0100000001000000),
ObjectRef(0a38bab12388e801ffffffff0100000001000000)]
Once I call ray.get(array) again, I do get a list of numpy arrays. So the problem stems from the fact that start_to_end returns results from other remote functions while also being a remote function itself. It took me quite some time to figure this out, so maybe this could be indicated more clearly in the documentation?

Maybe try:
arrays = [ray.get(start_to_end.remote(l[0], l[1], 1000, 1)) for l in paths_]

Peter

Thanks for the idea, Peter.
Unfortunately, the approach you proposed again returns only object refs.
In addition, I think it will not parallelise anything, because ray.get is called on each element of the list in turn, so every task is awaited before the next one is submitted.
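
The difference between waiting on each task inside the comprehension and submitting everything first can be seen without Ray. Below is a minimal sketch that uses the standard library's concurrent.futures purely as a stand-in: pool.submit plays the role of f.remote and .result() plays the role of ray.get. The function names (slow_identity, blocking_per_task, submit_then_collect, timed) are made up for this illustration, and the analogy is an assumption about the shape of the problem, not a description of Ray internals.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_identity(i):
    """Stand-in for a remote task: sleep briefly, then return the input."""
    time.sleep(0.2)
    return i


def blocking_per_task(pool, n):
    # Analogous to [ray.get(f.remote(i)) for i in range(n)]:
    # each result is awaited before the next task is submitted.
    return [pool.submit(slow_identity, i).result() for i in range(n)]


def submit_then_collect(pool, n):
    # Analogous to ray.get([f.remote(i) for i in range(n)]):
    # all tasks are submitted first, then the results are gathered.
    futures = [pool.submit(slow_identity, i) for i in range(n)]
    return [fut.result() for fut in futures]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


with ThreadPoolExecutor(max_workers=4) as pool:
    serial_out, serial_t = timed(blocking_per_task, pool, 4)
    parallel_out, parallel_t = timed(submit_then_collect, pool, 4)

print(f"blocking per task: {serial_t:.2f}s, submit then collect: {parallel_t:.2f}s")
```

Both variants return the same values; only the second lets the tasks overlap, which is why the first one takes roughly the sum of the individual sleep times.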

I tested parallelization with this example:

import ray
import time

ray.init()
@ray.remote
def f(i):
    time.sleep(1)
    return i

futures = [print(ray.get(f.remote(i))) for i in range(20)]

and it worked fine for me; maybe it will be useful.

Peter

What does your start_to_end look like? It probably returns a list of object ids.

Indeed, start_to_end returns a list of object ids. I just thought it would be useful to have to call ray.get() only once to get all the objects, instead of having to call it twice in a row.

    @ray.remote
    def start_to_end(in_path, out_prefix, length, label, depth=2):
        sequences = list(SeqIO.parse(in_path, "fasta"))
        fragments = fragment.remote(sequences, length)
        corrected_fragments = correct.remote(fragments)
        encoded_fragments = \
            ray.get(one_hot_encode.remote(corrected_fragments, label, depth))
        return encoded_fragments
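
One way to get everything back with a single call on the caller's side is a small helper that keeps resolving until no references remain. This is only a sketch: resolve_nested is a hypothetical helper, not part of Ray, and FakeRef is a made-up stand-in so the snippet runs without a cluster. The commented-out call shows how it might be wired to ray.get and ray.ObjectRef, assuming ray.init() has already run.

```python
def resolve_nested(value, is_ref, get):
    """Call `get` until no references remain, descending into lists and tuples."""
    while is_ref(value):
        value = get(value)
    if isinstance(value, (list, tuple)):
        return type(value)(resolve_nested(v, is_ref, get) for v in value)
    return value


# With Ray this might be called roughly as follows (not executed here):
#   arrays = resolve_nested(
#       [start_to_end.remote(l[0], l[1], 1000, 1) for l in paths_],
#       is_ref=lambda v: isinstance(v, ray.ObjectRef),
#       get=ray.get,
#   )


# Ray-free demonstration with a hypothetical stand-in reference type.
class FakeRef:
    def __init__(self, payload):
        self.payload = payload


nested = [FakeRef([FakeRef(1), FakeRef(2)]), FakeRef(FakeRef(3))]
resolved = resolve_nested(
    nested,
    is_ref=lambda v: isinstance(v, FakeRef),
    get=lambda ref: ref.payload,
)
print(resolved)  # [[1, 2], 3]
```

Because the outer tasks are all submitted before the helper starts resolving, the tasks themselves can still run in parallel; the helper only changes how the results are collected.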

The problem and a workaround for nested remote functions are discussed in GitHub issue #3644 of ray-project/ray, "Ray starts too many workers (and may crash) when using nested remote functions."