Unable to batch processes that include S3 operations

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I have a pipeline in which files are downloaded from a remote server and processed on an AWS cluster to derive the output. The cluster has 4 cores and 32 GB of RAM. The whole system runs in batches, with 3 workflows running in parallel.

Use Case

Each workflow downloads 2 files from the remote server, processes them, and stores the results on S3.

The whole pipeline works fine for a batch size of 3, meaning 6 files can be downloaded, processed, and stored on S3 in parallel.
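For context, the batch structure is roughly the following. This is only a stdlib sketch of the shape of the pipeline, not the actual Ray workflow code; the function and file names are hypothetical stand-ins for the real download/process/upload steps:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 3  # 3 works; raising this to 4 triggers the error below


def run_workflow(i):
    # Each workflow handles 2 files: download, process, store on S3.
    files = [f"remote/file_{i}_{j}" for j in range(2)]  # hypothetical remote paths
    processed = [name.upper() for name in files]        # placeholder for real processing
    return processed                                    # stand-in for the S3 upload step


# All workflows in a batch run in parallel.
with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
    results = list(pool.map(run_workflow, range(BATCH_SIZE)))
```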

The problem starts when we increase the batch size: at 4, it throws a RaySystemError. The workflow fails during execution with:

S3 subsystem not initialized;

I also want to point out that the workflow storage path is an S3 bucket.


Error Tail Logs

  File "pyarrow/_s3fs.pyx", line 214, in pyarrow._s3fs.S3FileSystem._reconstruct
  File "pyarrow/_s3fs.pyx", line 204, in pyarrow._s3fs.S3FileSystem.__init__
  File "pyarrow/error.pxi", line 141, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 97, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: S3 subsystem not initialized; please call InitializeS3() before carrying out any S3-related operation

Although I can work around this by limiting the batch size to 3, I am looking for a concrete explanation of why it works for a batch size of 3 but not for more than that.
That might help me fix the issue properly.

Thanks in advance.