Proper way to load environment data

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

Hello,

I am currently using RLlib to train an agent on data read from a file on my hard drive. My current workflow is this:

  1. Load the file (a .pkl file) when the environment is initialized.
  2. On each reset, sample N points from the loaded data.
  3. On each step, iterate over this sample until the episode is done.
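
The three steps above can be sketched roughly as follows. This is a minimal, hypothetical illustration rather than the actual environment: the class name `ReplayFileEnv`, the `n_points` and `seed` parameters, the constant reward, and the assumption that the pickle holds a plain list of observations are all placeholders. It also assumes that "sample N points" means drawing N items at random without replacement. The class deliberately follows the classic Gym-style `reset`/`step` shape without subclassing any library, so a real RLlib environment would also need observation and action spaces.

```python
import pickle
import random


class ReplayFileEnv:
    """Gym-style sketch that replays samples drawn from a pickled dataset."""

    def __init__(self, path, n_points, seed=None):
        # Step 1: load the whole file into memory once, at init.
        # Assumption: the pickle contains a list of observations.
        with open(path, "rb") as f:
            self.data = pickle.load(f)
        self.n_points = n_points
        self.rng = random.Random(seed)
        self.episode = []
        self.cursor = 0

    def reset(self):
        # Step 2: sample N points (without replacement) from the loaded data.
        self.episode = self.rng.sample(self.data, self.n_points)
        self.cursor = 0
        return self.episode[self.cursor]

    def step(self, action):
        # Step 3: advance through the sample; the episode is done once
        # the last sampled point has been returned.
        self.cursor += 1
        done = self.cursor >= self.n_points - 1
        obs = self.episode[min(self.cursor, self.n_points - 1)]
        reward = 0.0  # placeholder; the real reward logic is not shown here
        return obs, reward, done, {}
```

In this sketch the full dataset stays in memory for the lifetime of the environment, and each environment instance holds its own copy, which is where the memory cost of this approach comes from.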

This allows me to train the agent on previously recorded data instead of waiting for live data, which updates slowly.

In this context, is this the correct way to load the data? It consumes several GB of RAM. Or is there a more optimized way? I have only basic knowledge of RLlib, so any resource or reference on how to do this would be appreciated.

Thank you for your time!