Suppose I have a cluster of 4 nodes and the following simple code:
import random
import time
import ray

@ray.remote
def do_some_work():
    time.sleep(random.uniform(0, 4))  # Replace this with work you need to do.

results = ray.get([do_some_work.remote() for x in range(4)])
How can I ensure that each new Ray remote call goes to a different node? If a node is unavailable at the moment (it has no free resources or is still running a previous Ray task), the call should wait until that node becomes available…