Remote function not working as expected

Hi, I have just started with Ray, so I am a total beginner. I have manually set up a cluster of two nodes on VMware Workstation, using Ray 2.0.0.dev0 on Python 3.6.4 on both nodes.

Now I run the following code on my child node:

import pandas as pd
import ray

ray.init(address='auto', _redis_password='5241590000000000')
df = pd.read_csv('')

@ray.remote
def f(df):
    # Make a copy on remote nodes (pandas uses to_csv; there is no export_csv)
    df.to_csv("~/test_ray.csv")
    # Do some calculations and return df
    modified_df = df.copy()  # placeholder for the real calculations
    return modified_df

# Store the dataframe in Ray's shared-memory object store.
df_id = ray.put(df)

# Pass the object reference so the task reads the stored copy.
result_ids = [f.remote(df_id)]
print(result_ids)

# Get the results.
results = ray.get(result_ids)
print(results)

What I am expecting is for the file to be created on both nodes, but this code only executes on the child node. If I instead connect with ray.client("192.168.169.129:10001").connect(), the code executes only on the parent node and not on the child node. Any help would be much appreciated.

It looks like you're only invoking f.remote() once here, so the task will be scheduled and run on a single node only. In general, one way to run a remote function on each node in your cluster is to use Placement Groups (see the Ray v2.0.0.dev0 documentation) with the strategy "SPREAD" or "STRICT_SPREAD". If you invoke f.remote() multiple times without Placement Groups, Ray will automatically decide which nodes to schedule the tasks on based on the available CPU resources, so there is no guarantee that every node gets one.
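To make the Placement Group suggestion concrete, here is a minimal sketch, not a definitive implementation. It assumes a two-node cluster with at least one free CPU per node. The names spread_bundles, run_on_every_node, write_marker and the marker file name are made up for illustration. It uses the placement_group= task option from the Ray version discussed here; newer releases prefer passing a PlacementGroupSchedulingStrategy via scheduling_strategy=.

```python
def spread_bundles(num_nodes, cpus_per_task=1):
    """Build one resource bundle per node (hypothetical helper)."""
    return [{"CPU": cpus_per_task} for _ in range(num_nodes)]


def run_on_every_node(num_nodes=2):
    """Run one task per node by pinning each task to its own bundle."""
    # Ray is imported here so that spread_bundles() can be inspected
    # on a machine without Ray installed.
    import ray
    from ray.util.placement_group import placement_group

    ray.init(address="auto")  # assumes an already running cluster

    @ray.remote(num_cpus=1)
    def write_marker():
        import os
        import socket

        hostname = socket.gethostname()
        # Hypothetical file name; written on whichever node runs the task.
        path = os.path.expanduser("~/test_ray_marker.txt")
        with open(path, "w") as fh:
            fh.write(hostname)
        return hostname

    # STRICT_SPREAD requires every bundle to land on a different node.
    pg = placement_group(spread_bundles(num_nodes), strategy="STRICT_SPREAD")
    ray.get(pg.ready())  # blocks until all bundles are reserved

    refs = [
        write_marker.options(
            placement_group=pg,
            placement_group_bundle_index=i,
        ).remote()
        for i in range(num_nodes)
    ]
    return ray.get(refs)
```

Calling run_on_every_node() from the driver should return one hostname per node. With "SPREAD" instead of "STRICT_SPREAD", Ray only tries to spread the bundles on a best-effort basis, so two tasks may still share a node.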

If you simply want to make a certain file available on the remote nodes of your cluster, you can try Runtime Environments (see "Advanced Usage" in the Ray v2.0.0.dev0 documentation), which is a new feature (feedback welcome!). Specifically, if "/code/my_project" is the directory on your laptop where test_ray.csv resides, then connecting with ray.client("192.168.169.129:10001").env({"working_dir": "/code/my_project"}).connect() will automatically upload that directory to the cluster and change the current working directory of the remote workers to it, so that the body of your remote function can be:

@ray.remote
def f(df):
    # Make a copy on remote nodes
    df.to_csv("test_ray.csv")  # Relative path: interpreted relative to working_dir
    # Do some calculations and return df
    modified_df = df.copy()  # placeholder for the real calculations
    return modified_df

cc @yic