I tried to download from a private GitLab repository and found that the downloaded content is empty.
rt = RuntimeEnv(
Because the URL needs a Git token, the path /tmp/…/working_dir_files/ on the head node has no content.
After adding the token (https://git.selflab.co/factory/-/archive/master/factory-master.zip?private_token=xxxx), I get “does not exist on the cluster. Something may have gone wrong while downloading or unpacking the working_dir.”
It seems the zip path already exists on the head node. How can I delete it?
Internally we use smart_open (smart-open · PyPI) to download zips. Can you try this script to see if it works?
import smart_open
with smart_open.open("https://git.selflab.co/factory/-/archive/master/factory-master.zip?private_token=xxxx", "rb", transport_params=None) as zip_file:
    print(len(zip_file.read()))  # a non-zero size means the download returned data
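If the download returns data but the working_dir still ends up empty, it may help to check that the bytes are really a zip archive. The sketch below is only an illustration: `summarize_zip` and `download` are hypothetical helper names, and the idea that a request without a valid token returns something other than a zip (for example an HTML page) is an assumption, not something confirmed in this thread.

```python
import io
import zipfile


def summarize_zip(data: bytes) -> list[str]:
    """Return the entry names if `data` is a zip archive, else raise ValueError."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        # Show the first bytes to help spot an HTML error or sign-in page.
        raise ValueError(f"not a zip archive; starts with {data[:60]!r}")
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()


def download(url: str) -> bytes:
    """Fetch `url` with smart_open, the library the reply says Ray uses."""
    import smart_open  # imported lazily so summarize_zip works without it

    with smart_open.open(url, "rb") as remote_file:
        return remote_file.read()
```

Calling `summarize_zip(download(url))` with the tokenized URL should list the files in the archive, or raise with a preview of what was actually returned.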
Also would you mind sharing this log file:
I finally fixed it as you suggested, but two points still confuse me.
- If I submit the same job twice and the Git URI has a new commit by the second submission, how can I ensure that the second run uses the latest commit?
- How can I clear /tmp/…/working_resouces/myrepo, which holds the latest download from the URI?
Ray caches the zip file by URL, so it won’t always download the latest zip. If you always want the latest commit, I think you can fetch the commit hash in the driver script and use that hash in the URL, so that each commit gets its own URL. You may also be able to add your access token in the URL like this:
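As a rough sketch of the commit-pinning suggestion (not the example from the reply itself), the driver script could resolve the branch to a commit hash and build the archive URL from it. The helper names `resolve_commit` and `pinned_archive_url` are hypothetical. The sketch assumes that `git` is installed and has credentials for the repository, and that GitLab accepts a commit hash in the same place where the URL in this thread has `master`.

```python
import subprocess
from urllib.parse import urlencode


def resolve_commit(repo_url: str, branch: str = "master") -> str:
    """Ask the remote which commit hash `branch` currently points to."""
    result = subprocess.run(
        ["git", "ls-remote", repo_url, f"refs/heads/{branch}"],
        check=True,
        capture_output=True,
        text=True,
    )
    if not result.stdout.strip():
        raise ValueError(f"branch {branch!r} not found on {repo_url}")
    # Each output line has the form "<hash>\t<ref>"; take the hash.
    return result.stdout.split()[0]


def pinned_archive_url(project_url, project_name, commit, private_token=None):
    """Build a GitLab-style archive URL that names a specific commit."""
    base = project_url.rstrip("/")
    url = f"{base}/-/archive/{commit}/{project_name}-{commit}.zip"
    if private_token:
        # Same query parameter as the URL used earlier in this thread.
        url += "?" + urlencode({"private_token": private_token})
    return url
```

Because the URL changes whenever the commit changes, a cache keyed by URL would treat each commit as a new download. The result would then be passed as the `working_dir` in place of the `master` URL.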