What is the best practice for launching jobs with really large dependencies, e.g. large binaries with many dynamic libs that may be larger than 500 MB? How would you compare the three options below?
- Submit via runtime_env.
- Let Ray actors/workers individually retrieve those binaries/.so files from shared storage, e.g. S3.
- Build those binaries/.so files together with Ray into an image that runs on each pod.
I think runtime_env doesn’t support data bigger than 100 MB right now (cc @architkulkarni for confirmation), so I feel like option 1 is not viable. I think options 2 and 3 should both be good? The third one might be heavyweight, but you won’t have the additional overhead of waiting to download and load the shared objects.
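For reference, option 2 could look roughly like this (an untested sketch; the bucket, key, and exported symbol are placeholders, and it assumes boto3 is available on every node):

```python
import ctypes
import os

import boto3
import ray


@ray.remote
class LibUser:
    """Each actor pulls the shared object from S3 once, then dlopen()s it."""

    def __init__(self, bucket: str, key: str):
        local_path = os.path.join("/tmp", os.path.basename(key))
        # Skip the download if another worker on this node already fetched it
        # (not concurrency-safe; a real version would lock around this).
        if not os.path.exists(local_path):
            boto3.client("s3").download_file(bucket, key, local_path)
        self.lib = ctypes.CDLL(local_path)

    def call(self) -> int:
        # Placeholder symbol; replace with your library's real entry point.
        return self.lib.my_function()


ray.init()
actor = LibUser.remote("my-bucket", "libs/libbig.so")  # placeholder bucket/key
```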
That’s right: for uploading directly from your local machine to the Ray cluster, the limit is 100 MB. But I agree that 2 and 3 are good options, and the way Sang compared them makes sense. Here’s some documentation for option 2: Handling Dependencies — Ray v1.10.0
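One note: since the 100 MB limit applies to uploads from your local machine, you can also host the packaged dependencies as a zip at a remote URI and have each node download it directly, which keeps runtime_env workable for larger artifacts. A minimal sketch (the S3 URI is a placeholder, and S3 URIs assume smart_open and boto3 are installed on the cluster):

```python
import ray

# The zip is downloaded by each node from S3 rather than uploaded from the
# driver, so the 100 MB local-upload limit doesn't apply here.
ray.init(
    runtime_env={
        "working_dir": "s3://my-bucket/my-deps.zip",  # placeholder URI
    }
)
```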