How to write files to a worker's local storage in KubeRay

I’m using KubeRay to run a large data processing job. I need to download some data and write intermediate results to disk. But when I run the following code, it seems that all workers and the head share the same disk storage.

import os

def compute(num: int) -> int:
    path = '/tmp/test.txt'
    # Fails if another task has already written the same file
    if os.path.exists(path):
        raise FileExistsError('File already exists')
    with open(path, encoding='utf-8', mode='w') as f:
        f.write('#' * 1000)
    return num**2

Ideally, each worker's local storage would be isolated and cleared when the task completes.


If I understand correctly, each worker node has its own /tmp directory, so if more than one compute function is scheduled on worker processes on the same node, they will all see the same file under /tmp. If you want each function instance to have a unique file and avoid this collision, have the compute function create a tempfile, which is guaranteed to be unique per instance of the compute function on the same node.
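A minimal sketch of that suggestion, assuming the intermediate file is only needed while the function runs. The prefix "compute_" and the file name test.txt are illustrative choices, not anything Ray or KubeRay requires. tempfile.TemporaryDirectory creates a uniquely named directory and removes it together with its contents when the with block exits:

```python
import os
import tempfile


def compute(num: int) -> int:
    # Each call gets its own uniquely named scratch directory (created under
    # the system temp dir, typically /tmp on Linux), so concurrent calls on
    # the same node do not collide.
    with tempfile.TemporaryDirectory(prefix="compute_") as scratch_dir:
        path = os.path.join(scratch_dir, "test.txt")
        with open(path, encoding="utf-8", mode="w") as f:
            f.write("#" * 1000)
        # ... read or process the intermediate file here ...
    # Leaving the with block deletes scratch_dir and everything in it.
    return num**2
```

In a Ray program the same function body could be decorated with @ray.remote. Note that the cleanup runs when the with block exits, so a worker process that is killed mid-task may leave its directory behind.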

cc: @architkulkarni @Kai-Hsun_Chen

I agree with Jules’s suggestion to use Python's tempfile module (see "tempfile: Generate temporary files and directories" in the Python 3.12.0 documentation). I think if each Ray task were to have its own isolated storage, the overhead might cause issues at scale. Plus, there are use cases where you want all tasks to share the same local filesystem.