Offline API with Google Cloud Storage bucket

Hi everyone,

I am running my experiments in the cloud, and that works fine. I use RLlib’s Offline API to write out variables from my custom environment and policy, and that also works fine.

What I want to do now is to store these Offline API outputs not on the head node but in a GCS bucket (which works, for example, for Tune syncing). I saw in the documentation:

    # Specify where experiences should be saved:
    #  - None: don't save any experiences
    #  - "logdir" to save to the agent log dir
    #  - a path/URI to save to a custom output directory (e.g., "s3://bucket/")
    #  - a function that returns a rllib.offline.OutputWriter
    "output": None,
    # What sample batch columns to LZ4 compress in the output data.
    "output_compress_columns": ["obs", "new_obs"],
    # Max output file size before rolling over to a new file.
    "output_max_file_size": 64 * 1024 * 1024,

where it also says “e.g., s3://bucket/”. This led me to think it would work with GCS as well, but it does not. I do not even know where the output gets written - at least it does not get written into the GCS bucket :grinning_face_with_smiling_eyes:
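For context, this is roughly how the whole thing is wired up. The trainer name, env, and stop criterion below are placeholders, not my actual experiment; the three "output" keys are my real settings:

    # Rough shape of my setup ("PPO", "CartPole-v0" and the stop criterion
    # are placeholders; the "output" keys are the ones from the docs above).
    from ray import tune

    tune.run(
        "PPO",
        stop={"training_iteration": 10},
        config={
            "env": "CartPole-v0",
            "output": "gs://output-from-train/output/",  # my GCS path
            "output_compress_columns": ["obs", "new_obs"],
            "output_max_file_size": 64 * 1024 * 1024,
        },
    )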

I checked the path to the bucket several times (I copied it directly) and made sure it is correct. When running tune.run(), I get the following debug info:

    DEBUG json_writer.py:77 -- Wrote 1477409 bytes to <_io.TextIOWrapper name='output/output-2021-10-28_02-22-24_worker-1_0.json' encoding='UTF-8'> in 0.08527207374572754s
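(For what it is worth, the bucket path itself seems fine: a quick listing along the following lines, using the google-cloud-storage client with default application credentials, works from the head node. The client usage here is just my own sanity check, nothing RLlib-specific.)

    # Sanity check that the bucket path is valid and reachable.
    from google.cloud import storage

    client = storage.Client()
    for blob in client.list_blobs("output-from-train", prefix="output/"):
        print(blob.name)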

The path in my config was "output": "gs://output-from-train/output/" (see the snippet above), yet judging by the debug line it looks like the gs:// URI was not picked up and the files ended up in a relative local directory output/ on the worker instead. What happened here? Can anyone help? Is GCS support even implemented for the Offline API, or am I tilting at windmills here?
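In case it is useful for the discussion: since the docs snippet above also allows “a function that returns a rllib.offline.OutputWriter”, I have been considering a workaround along these lines. This is an untested sketch of my own, not a confirmed solution - the GCSJsonWriter class, the bucket/prefix arguments, and the naive JSON conversion (no LZ4 compression, single-agent batches only) are all my assumptions, and I believe (but have not verified) that the function is called with an IOContext on each rollout worker:

    # Untested sketch: a custom OutputWriter that uploads each SampleBatch
    # to GCS as a JSON blob. It skips the LZ4 compression that JsonWriter
    # applies and assumes single-agent batches with numpy-array columns.
    import json

    import numpy as np
    from google.cloud import storage  # third-party GCS client
    from ray.rllib.offline import OutputWriter


    class GCSJsonWriter(OutputWriter):
        def __init__(self, ioctx, bucket_name="output-from-train", prefix="output"):
            self.bucket = storage.Client().bucket(bucket_name)
            self.prefix = prefix
            self.worker_index = ioctx.worker_index if ioctx else 0
            self.counter = 0

        def write(self, sample_batch):
            # Convert numpy columns to JSON-serializable lists.
            data = {k: np.asarray(v).tolist() for k, v in sample_batch.items()}
            name = "{}/worker-{}_batch-{:06d}.json".format(
                self.prefix, self.worker_index, self.counter)
            self.counter += 1
            self.bucket.blob(name).upload_from_string(json.dumps(data))

    # In the config, instead of the gs:// path:
    # "output": lambda ioctx: GCSJsonWriter(ioctx),

Of course I would much rather have the built-in gs:// support work, so that compression and file rollover behave the same as for local or S3 output.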

Any help is welcome!

Opened an issue.