Hi everyone!
I ran into a problem that I think may be easy to solve.
I am parallelising my code with the @ray.remote decorator on functions.
I have four decorated functions, and one of them (start_to_end) calls the other three one after another via .remote().
Finally, I want to obtain the parallelised results with this code:
start = time.time()
arrays = ray.get([start_to_end.remote(l[0], l[1], 1000, 1) for l in paths_])
end = time.time()
print(end - start)
print(arrays)
I expected the print to show numpy arrays, but I get object refs instead. This is contrary to the behaviour shown in the tutorials:
[ObjectRef(5509701662a88bedffffffff0100000001000000),
ObjectRef(b8fede355f38ebfdffffffff0100000001000000),
ObjectRef(c69a52ad8b98212fffffffff0100000001000000),
ObjectRef(0a38bab12388e801ffffffff0100000001000000)]
Once I call ray.get(arrays) a second time, I do get a list of numpy arrays. So the problem stems from the fact that start_to_end gets its results from other remote functions while being a remote function itself. It took me quite some time to figure this out, so maybe this could be indicated more clearly in the documentation?
Maybe try:
arrays = [ray.get(start_to_end.remote(l[0], l[1], 1000, 1)) for l in paths_]
Peter
Thanks for the idea, Peter.
Unfortunately, your suggestion still returns only ObjectRefs.
In addition, I think it will not run in parallel, since ray.get is called on each element of the list in turn.
I tested parallelization with this example:
import ray
import time

ray.init()

@ray.remote
def f(i):
    time.sleep(1)
    return i

futures = [print(ray.get(f.remote(i))) for i in range(20)]
and it worked fine for me; maybe it will be useful.
Peter
What does your start_to_end look like? It probably returns a list of object ids.
Indeed, start_to_end returns a list of object ids. I just thought it could be useful to only have to call ray.get() once to get all the objects, instead of having to call it twice consecutively.
```
@ray.remote
def start_to_end(in_path, out_prefix, length, label, depth=2):
    sequences = list(SeqIO.parse(in_path, "fasta"))
    fragments = fragment.remote(sequences, length)
    corrected_fragments = correct.remote(fragments)
    encoded_fragments = \
        ray.get(one_hot_encode.remote(corrected_fragments, label, depth))
    return encoded_fragments
```
The problem with nested remote functions, and a workaround, are discussed here: "Ray starts too many workers (and may crash) when using nested remote functions", Issue #3644, ray-project/ray on GitHub.