How to access Amazon S3

Gil_Vernik · January 6, 2022, 8:10am

I posted this in Ray Data, but it seems less active than the core forum. Sorry for the multiple posts.

There are various examples of how Ray can read and write data from Amazon S3, for example:

ds = ray.data.read_binary_files("s3://bucket/image-dir")

How do I configure Ray with S3 credentials? I don’t run Ray in AWS; I run it locally on my laptop (I just installed it with pip), and I want to read data from my Amazon S3 bucket and also write to it.

Thanks

Clark_Zinzow · January 7, 2022, 12:10am

Hi @Gil_Vernik! If you set your AWS credentials via the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, Datasets should use those credentials without any code changes.

If this environment variable method isn’t agreeable, you can pass ray.data.read_binary_files() an Arrow S3FileSystem instance containing your AWS credentials (see the .read_binary_files() API).
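The two options above could look roughly like the sketch below. The helper names (set_aws_env_credentials, read_with_explicit_filesystem) are hypothetical and only for illustration; the region argument and all credential values are assumptions you would replace with your own.

```python
import os


def set_aws_env_credentials(access_key_id, secret_access_key):
    """Option 1: export the two environment variables named above.

    Assumption: on a single machine, set these before Ray starts so that
    locally launched worker processes inherit them.
    """
    os.environ["AWS_ACCESS_KEY_ID"] = access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_access_key


def read_with_explicit_filesystem(path, access_key_id, secret_access_key, region):
    """Option 2: pass an Arrow S3FileSystem that carries the credentials."""
    # Imported lazily so the module loads even where Ray/PyArrow are absent.
    import ray
    from pyarrow.fs import S3FileSystem

    fs = S3FileSystem(
        access_key=access_key_id,
        secret_key=secret_access_key,
        region=region,  # assumption: the bucket's region, e.g. "us-east-1"
    )
    return ray.data.read_binary_files(path, filesystem=fs)


# Example usage (placeholder values, not real credentials):
#   set_aws_env_credentials("my-access-key", "my-secret-key")
#   ds = ray.data.read_binary_files("s3://bucket/image-dir")
```

Hard-coding secrets in source files is best avoided; in practice you would typically export the variables in your shell or load them from a secrets store.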

HurleyWu · February 10, 2022, 3:33am

Hi @Clark_Zinzow, how do I set all the environment variables when I use MinIO locally? I would rather not pass an Arrow S3FileSystem instance to the .read_binary_files API.

Amit_Gelber · June 19, 2023, 12:42pm

Full example, passing an s3fs filesystem with explicit credentials and a custom endpoint:

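import ray
import s3fs

# fsspec-compatible S3 filesystem with explicit credentials and a custom endpoint.
fs = s3fs.S3FileSystem(
    anon=False,
    use_ssl=False,  # plain HTTP; certificate verification is also disabled below
    client_kwargs={
        "aws_access_key_id": "key",
        "aws_secret_access_key": "key",
        "endpoint_url": "endpoint",
        "verify": False,
    },
)

# Pass the filesystem explicitly so Ray uses these credentials.
ds = ray.data.read_parquet(paths="s3://....parquet", filesystem=fs)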
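
The original question also asked about writing back to S3. A minimal sketch, assuming the same kind of s3fs filesystem can be reused for the write: the helpers make_s3fs_kwargs and copy_parquet are hypothetical names introduced here, and the paths and endpoint in the usage comment are placeholders.

```python
def make_s3fs_kwargs(access_key_id, secret_access_key, endpoint_url):
    """Build keyword arguments for s3fs.S3FileSystem (hypothetical helper).

    Mirrors the settings in the example above: no TLS and no certificate
    verification, which only makes sense for a trusted local endpoint.
    """
    return {
        "anon": False,
        "use_ssl": False,
        "client_kwargs": {
            "aws_access_key_id": access_key_id,
            "aws_secret_access_key": secret_access_key,
            "endpoint_url": endpoint_url,
            "verify": False,
        },
    }


def copy_parquet(src_path, dst_path, fs_kwargs):
    """Read Parquet from src_path and write it to dst_path on the same store."""
    # Imported lazily so the helper above can be used without Ray installed.
    import ray
    import s3fs

    fs = s3fs.S3FileSystem(**fs_kwargs)
    ds = ray.data.read_parquet(src_path, filesystem=fs)
    ds.write_parquet(dst_path, filesystem=fs)
    return ds


# Example usage (placeholder values):
#   kwargs = make_s3fs_kwargs("key", "secret", "http://localhost:9000")
#   copy_parquet("s3://bucket/in.parquet", "s3://bucket/out-dir", kwargs)
```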
