Change or Generate offline data

I want to modify the offline data with my own actions, but I see that the observations in the .json file are encoded (in what format?). The .json file also contains many other columns, such as actions and previous rewards. Do I need all of those if I want to modify or generate new offline data?

I want to create new offline data with one action for each observation. Is that possible?

It would be much easier if the action could be changed on the fly in the simulation when using offline data.

  • High: It blocks me from completing my task.

The columns of your offline data depend on the algorithm you are using. At the time of writing, postprocessing of data is done in the sample-collection stage of our data flow. That means that when your RolloutWorkers sample from the environment, they also call the postprocess_trajectory() method of the policy used for collection. So once experiences have been collected by the RolloutWorkers, the necessary columns should already be there.

Have a look at the custom rollout worker workflow example if you want to find a good place to start your data collection routine.
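
For illustration, a minimal sketch (the path and the dummy column values below are made up, not taken from the docs) of writing a hand-built batch with RLlib's JsonWriter and reading it back with JsonReader, so you can see which columns end up in the file:

import numpy as np
from ray.rllib.offline.json_reader import JsonReader
from ray.rllib.offline.json_writer import JsonWriter
from ray.rllib.policy.sample_batch import SampleBatch

# Write one hand-built batch: exactly one action per observation.
writer = JsonWriter("/tmp/my-offline-data")
batch = SampleBatch({
    "obs": np.random.rand(5, 400).astype(np.float32),
    "actions": np.random.uniform(-1, 1, size=(5, 2)).astype(np.float32),
    "rewards": np.ones(5, dtype=np.float32),
    "dones": np.array([False, False, False, False, True]),
})
writer.write(batch)

# Read the data back and inspect which columns are present.
reader = JsonReader("/tmp/my-offline-data")
restored = reader.next()
print(restored.keys())

Keep in mind that, as said above, which columns you actually need (e.g. eps_id, t, action probabilities) depends on the algorithm you want to train with.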

The observations in the .json files that are included with RLlib are not encoded, as far as I can see. Can you point to what you are referring to?


Thanks, I will take a look. I use APPO because it is the fastest, and I see that it is recommended in many places in the RLlib documentation.

This is what the observation in the .json file looks like:
{"type": "SampleBatch", "obs": "BCJNGEhAinECAAAAAAB4AAABgIAFlSYAAAAAAAAAjBJudW1weS

How do I convert it back to floats?

Please provide a script showing how you create this.

import gym
import numpy as np
import ray
from ray import tune
from ray.rllib.agents import ppo
from ray.tune.logger import pretty_print

class MyEnv(gym.Env):
    def __init__(self, config=None):
        super(MyEnv, self).__init__()        

        self.timestep = 0
        self.action_space = gym.spaces.Box(
            low=-1, high=1, shape=(2,), dtype=np.float32)
        self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(400,), dtype=np.float32)
              
    def _next_observation(self):
        # Dummy observation; cast to float32 to match the observation space.
        obs = np.random.rand(400).astype(np.float32)
        return obs

    def _take_action(self, action):
        self._reward = 1

    def step(self, action):        
        # Execute one time step within the environment
        self._reward = 0
        self._take_action(action)        
        done = False        
        obs = self._next_observation()

        self.timestep += 1
        if self.timestep > 99:
            done = True

        return obs, self._reward, done, {}

    def reset(self):
        self._reward = 0
        self.total_reward = 0       
        self.visualization = None
        self.timestep=0
        return self._next_observation()



if __name__ == "__main__":
    ray.init()

    config = {
        "env": MyEnv,        
        "framework": "torch",
        "num_gpus": 1,
        "horizon": 100,        
        "num_workers": 2,
        "num_envs_per_worker": 2,
        "lr": 0.00001,
        "output": "TestOut",
        "output_max_file_size": 5000000,
    }

    stop = {
        "timesteps_total": 10
    }

    tune_flag = False
    if tune_flag:
        results = tune.run("PPO", config=config, stop=stop)
    else:
        agent = ppo.APPOTrainer(config=config, env=MyEnv)
        for _ in range(100):
            result = agent.train()
            print(pretty_print(result))
    ray.shutdown()

@evo11x,

In the config there is a key with these defaults:

   "output_compress_columns": ["obs", "new_obs"],

If you change that to an empty list, those columns will not be compressed. The compression used is LZ4.

You can find the documentation here:

https://docs.ray.io/en/latest/rllib/rllib-training.html#common-parameters
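
For example, a minimal sketch of what that would look like in the config from your script (only the output-related keys shown):

config = {
    "env": MyEnv,
    "framework": "torch",
    "output": "TestOut",
    "output_max_file_size": 5000000,
    # Default is ["obs", "new_obs"]; an empty list disables compression.
    "output_compress_columns": [],
}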



The actual compression code is here:


@mannyv Thanks!
And if I want to use the compressed data and decompress it, I assume I have to do a base64 decode and then lz4.frame.decompress?

@evo11x

You can just use
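
If I remember correctly, the helpers for this live in ray.rllib.utils.compression (pack / unpack / unpack_if_needed); they take care of the base64 decoding and LZ4 decompression plus the (de)serialization step for you. A minimal round-trip sketch (the array contents are made up):

import numpy as np
from ray.rllib.utils.compression import pack, unpack

# pack() is what the writer applies to the compressed columns:
# serialize -> lz4.frame.compress -> base64 encode. unpack() reverses it.
arr = np.random.rand(400).astype(np.float32)
encoded = pack(arr)    # the kind of string you see under "obs" in the .json file
decoded = unpack(encoded)
print(np.allclose(arr, decoded))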
