The error `aiohttp.web_exceptions.HTTPRequestEntityTooLarge: Request Entity Too Large` occurs because Ray enforces a 100 MiB limit on the working directory and files uploaded with a job submission. If your working directory (including the CSV or generated PDFs) exceeds this limit, the upload fails with this error. The recommended solution is to store large files such as the CSV and PDFs in cloud object storage (e.g., S3) and read them from within your Ray job instead of uploading them with the submission. You can also use the `excludes` field of `runtime_env` to keep large files out of the upload, or point `working_dir` at a remote URI if needed. See the sources below for examples and discussion.
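As a rough illustration of the `excludes` approach, the sketch below walks a working directory, lists every file above a chosen size in `excludes`, and checks that what remains fits under the 100 MiB limit mentioned above. The helper name `build_runtime_env`, the 10 MiB per-file threshold, and the script name `my_script.py` are assumptions made for this example, not part of Ray. The sizes are raw on-disk bytes, so this is only an approximation of what Ray would actually upload.

```python
from pathlib import Path

MIB = 1024 * 1024
UPLOAD_LIMIT_BYTES = 100 * MIB  # the limit cited in the answer above


def build_runtime_env(working_dir, max_file_bytes=10 * MIB):
    """Hypothetical helper: build a runtime_env dict that excludes large files.

    Files larger than max_file_bytes are listed in "excludes" by their path
    relative to working_dir, so they are left out of the job upload.
    """
    root = Path(working_dir)
    excludes = []
    kept_bytes = 0
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size > max_file_bytes:
            # Too large to ship with the job; keep it in object storage instead.
            excludes.append(path.relative_to(root).as_posix())
        else:
            kept_bytes += size
    if kept_bytes > UPLOAD_LIMIT_BYTES:
        raise ValueError(
            f"{kept_bytes} bytes would still be uploaded; lower max_file_bytes "
            "or move more files to object storage"
        )
    return {"working_dir": str(root), "excludes": excludes}


# Usage with the Ray Jobs SDK (needs a running cluster, so left commented out):
# from ray.job_submission import JobSubmissionClient
# client = JobSubmissionClient("http://127.0.0.1:8265")
# client.submit_job(
#     entrypoint="python my_script.py",  # hypothetical script name
#     runtime_env=build_runtime_env("."),
# )
```

Ray interprets `excludes` entries as gitignore-style patterns, so a bare filename may also match files with the same name in subdirectories. Anything excluded this way still has to be reachable from the job, for example by reading it from S3.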
Would you like more detail on how to restructure your workflow or configure Ray to avoid this error?
Sources:
- Discourse: aiohttp.web_exceptions.HTTPRequestEntityTooLarge
- Discourse: job working dir empty
- GitHub Issue: working dir upload size warning
- Anyscale Docs: Large dataset best practices