How to export/get the latest data of the env class after training?

Hi,
I wonder if anybody knows how to export the env data after training is finished? For example, say I have a counter in the step method of my custom env, and I would like to access it after training. If I still had access to the env instance after training, I could simply print(env.counter). But I don't know how to get hold of the env. I am using both the trainer and tune.

Thanks!

I have been using callbacks (as shown here) for logging and accessing data about the environment. It might not be the easiest solution, but if you want to use tune it can be tricky to access that data without going through built-in mechanisms such as this.
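
For reference, a minimal sketch of such a callback (assuming a custom env that exposes a counter attribute; the class name and metric key are illustrative):

from ray.rllib.agents.callbacks import DefaultCallbacks

class EnvDataCallbacks(DefaultCallbacks):
    # Called once per finished episode, on the worker that ran it.
    def on_episode_end(self, *, worker, base_env, policies, episode,
                       env_index, **kwargs):
        # Grab the underlying (unwrapped) env instance for this episode.
        env = base_env.get_unwrapped()[env_index]
        # Values stored in custom_metrics show up in the training results.
        episode.custom_metrics["counter"] = env.counter

You then register it via the config, e.g. config["callbacks"] = EnvDataCallbacks.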

Hi @albheim,
Thanks for your reply. Yes, I tried to use callbacks and ran the example you sent to see how they work, but the example does not work on my machine. This is the error:


(pid=186845) 2021-10-28 14:24:18,952	INFO trainer.py:758 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=186845) 2021-10-28 14:24:19,035	WARNING trainer_template.py:185 -- `execution_plan` functions should accept `trainer`, `workers`, and `config` as args!
(PG pid=186845) episode 982809536 (env-idx=0) started.
(PG pid=186845) episode 506580114 (env-idx=1) started.
(PG pid=186845) postprocessed 11 steps
2021-10-28 14:24:19,568	ERROR trial_runner.py:846 -- Trial PG_CartPole-v0_f84ea_00000: Error processing event.
Traceback (most recent call last):
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 812, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 767, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 89, in wrapper
    return func(*args, **kwargs)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PG.train_buffered() (pid=186845, ip=129.132.204.244, repr=PG)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/tune/trainable.py", line 189, in train_buffered
    result = self.train()
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 670, in train
    raise e
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 656, in train
    result = Trainable.train(self)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/tune/trainable.py", line 248, in train
    result = self.step()
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 206, in step
    step_results = next(self.train_exec_impl)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 876, in apply_flatten
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/execution/rollout_ops.py", line 75, in sampler
    yield workers.local_worker().sample()
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 752, in sample
    batches = [self.input_reader.next()]
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/sampler.py", line 103, in next
    batches = [self.get_data()]
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/sampler.py", line 233, in get_data
    item = next(self._env_runner)
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/sampler.py", line 599, in _env_runner
    _process_observations(
  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/sampler.py", line 923, in _process_observations
    callbacks.on_episode_end(
  File "/home/rdbt/ETHZ/cb.py", line 58, in on_episode_end
    assert episode.batch_builder.policy_collectors[
AttributeError: '_PolicyCollector' object has no attribute 'batches'
Result for PG_CartPole-v0_f84ea_00000:
  {}
  
(pid=186845) [2021-10-28 14:24:19,769 E 186845 187332] raylet_client.cc:159: IOError: Broken pipe [RayletClient] Failed to disconnect from raylet.
Traceback (most recent call last):

  File "/home/rdbt/ETHZ/cb.py", line 100, in <module>
    trials = tune.run(

  File "/home/rdbt/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/tune/tune.py", line 611, in run
    raise TuneError("Trials did not complete", incomplete_trials)

TuneError: ('Trials did not complete', [PG_CartPole-v0_f84ea_00000])

Hi @deepgravity ,

I had a similar problem: I wanted to store environment data and configuration for later detailed analysis (I actually also needed such data from the policy, which is a little more cumbersome than storing it from the environment).

As a solution I used RLlib's Offline API, which allows us to store observations, actions and rewards (and more) for later use (i.e. offline training). It can of course be repurposed to store the data in a digestible form (JSON) that can be read back in by the JsonReader class.

All you have to do is store the additional data you need in the info dictionary that gets returned from the environment's step() function. RLlib takes care of the rest and writes out SampleBatches that also include this info dictionary with your data. Take a look at the definition of the SampleBatch class to see whether the information you want is already stored there; if not, put your data into the info dictionary and RLlib handles the rest.
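
For illustration, a minimal sketch of an env that exports a counter this way (the env itself is hypothetical; only the info handling matters):

import gym
import numpy as np
from gym.spaces import Box, Discrete

class CounterEnv(gym.Env):
    # Hypothetical minimal env that counts its own steps.

    def __init__(self, config=None):
        self.observation_space = Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.action_space = Discrete(2)
        self.counter = 0

    def reset(self):
        return np.zeros(1, dtype=np.float32)

    def step(self, action):
        self.counter += 1
        done = self.counter % 10 == 0
        # Anything placed in `info` is carried along in the SampleBatches
        # that RLlib writes out via the Offline API.
        info = {"counter": int(self.counter)}
        return np.zeros(1, dtype=np.float32), 0.0, done, info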

Hope this helps here

Hi @Lars_Simon_Zehnder,
Many thanks. Storing the data in the info object is a great idea.

I followed the links you sent and stored the env data I needed in the info dictionary. But I got the following error; RLlib cannot serialize the data:

Traceback (most recent call last):

  File "/home/data/dbt/housing_design/agents_floorplan/my_runner.py", line 135, in <module>
    main()

  File "/home/data/dbt/housing_design/agents_floorplan/my_runner.py", line 130, in main
    runner.start()

  File "/home/data/dbt/housing_design/agents_floorplan/my_runner.py", line 118, in start
    env_to_agent.learn()

  File "/home/data/dbt/housing_design/agents_floorplan/env_to_agent.py", line 84, in learn
    self._run_the_trainer()

  File "/home/data/dbt/housing_design/agents_floorplan/env_to_agent.py", line 76, in _run_the_trainer
    self.my_trainer._the_trainer()

  File "/home/data/dbt/housing_design/agents_floorplan/my_trainer.py", line 66, in _the_trainer
    self.result = self.agent.train()

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 659, in train
    raise e

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 648, in train
    result = Trainable.train(self)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/tune/trainable.py", line 239, in train
    result = self.step()

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 206, in step
    step_results = next(self.train_exec_impl)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 1075, in build_union
    item = next(it)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 471, in base_iterator
    yield ray.get(futures, timeout=timeout)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 89, in wrapper
    return func(*args, **kwargs)

  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()

RayTaskError(TypeError): ray::RolloutWorker.par_iter_next() (pid=15129, ip=192.168.16.146, repr=<ray.rllib.evaluation.rollout_worker.modify_class.<locals>.Class object at 0x7f4a8803e460>)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/util/iter.py", line 1151, in par_iter_next
    return next(self.local_it)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 373, in gen_rollouts
    yield self.sample()
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 769, in sample
    self.output_writer.write(batch)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/offline/json_writer.py", line 68, in write
    data = _to_json(sample_batch, self.compress_columns)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/site-packages/ray/rllib/offline/json_writer.py", line 125, in _to_json
    return json.dumps(out)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home/data/anaconda3/envs/rllib/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type int64 is not JSON serializable

I tracked the error in debug mode. It is related to the last line of the following method:

def _to_json(batch: SampleBatchType, compress_columns: List[str]) -> str:
    out = {}
    if isinstance(batch, MultiAgentBatch):
        out["type"] = "MultiAgentBatch"
        out["count"] = batch.count
        policy_batches = {}
        for policy_id, sub_batch in batch.policy_batches.items():
            policy_batches[policy_id] = {}
            for k, v in sub_batch.items():
                policy_batches[policy_id][k] = _to_jsonable(
                    v, compress=k in compress_columns)
        out["policy_batches"] = policy_batches
    else:
        out["type"] = "SampleBatch"
        for k, v in batch.items():
            out[k] = _to_jsonable(v, compress=k in compress_columns)
    return json.dumps(out)

And this is what the out variable looks like:

Does anybody know how to resolve this issue?

Thanks!

@deepgravity, do you have int64 variables somewhere in your observation space? I seem to remember having similar problems with non-float variables.

Hi @Lars_Simon_Zehnder ,
Thank you for your reply.

I fixed the issue mentioned above. The problem was that I needed to store the data in the info dictionary as a nested dictionary, keyed by agent, like info = {"agent_1": some_dict}.
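
For anyone hitting the same TypeError, a small sketch of the shape that worked for me (agent_1 and the inner keys are just placeholders from my setup):

import numpy as np

def make_info(counter):
    # Hypothetical helper: nest the custom data one level deep, keyed by
    # agent, and cast numpy scalars (e.g. np.int64) to native Python types
    # so that json.dumps() accepts them.
    return {"agent_1": {"counter": int(counter)}}

# e.g. inside step():  info = make_info(self.counter)
print(make_info(np.int64(7)))  # {'agent_1': {'counter': 7}}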

Now, I can run the code without that error, but I have another problem.

It seems that RLlib stores my env data in the following JSON file, which tune creates automatically. Is that correct?
basic-variant-state-2021-11-09_11-38-01.json

I used the following JsonReader to read the above JSON file.

from ray.rllib.offline.json_reader import JsonReader

But I got the following error:


  File "D:\ETHZ\dbt_python\housing_design\agents_floorplan\my_json_reader.py", line 15, in <module>
    batch = reader.next()

  File "C:\Users\Reza\Anaconda3\envs\rlb\lib\site-packages\ray\rllib\offline\json_reader.py", line 88, in next
    batch = self._try_parse(self._next_line())

  File "C:\Users\Reza\Anaconda3\envs\rlb\lib\site-packages\ray\rllib\offline\json_reader.py", line 200, in _next_line
    line = self.cur_file.readline()

  File "C:\Users\Reza\Anaconda3\envs\rlb\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 82: character maps to <undefined>

Have you encountered this error?

Thanks!

Update:

I figured out what the issue was.

RLlib does not store the env data (including info) in basic-variant-state-2021-11-09_11-38-01.json; it stores the data in a directory you assign. To assign one, pass it to the config: config.update({"output": output_dir, "output_max_file_size": 5000000}), where output_dir is your desired directory, for example home/data. In other words, you need to update your config like this whenever you want to store env data.
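
As a sketch, the relevant config lines look like this (the path and the 5 MB limit are just the values from my setup):

output_dir = "/home/data"  # your desired directory
config = {}                # your existing trainer config
config.update({
    "output": output_dir,             # where RLlib writes the JSON sample files
    "output_max_file_size": 5000000,  # roll over to a new file at ~5 MB
})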

After training terminates, RLlib has stored JSON files like output-2021-11-09_18-36-32_worker-1_0.json in your output_dir. But reading such a file is not straightforward, as it contains batches of data. I think we now need to use JsonReader, but it gives me the following error! Does anyone know how to fix it?

    for episode in batch.split_by_episode():
AttributeError: 'MultiAgentBatch' object has no attribute 'split_by_episode'

When I open the JSON file in a viewer, I can see the env data, but reading it programmatically takes a bit of work!

I hope this info helps the other people who need to store env data.

Thanks, @Lars_Simon_Zehnder for suggesting storing data into the info dictionary.

Hi @deepgravity ,

happy to hear that you figured out how to store your variables and write them out.

Regarding your problem: MultiAgentBatch indeed has no split_by_episode() method (see the class definition).

To better understand what is happening here, I need some more information, at least a code example. How do you use the JsonReader?

Hi @Lars_Simon_Zehnder
Thank you very much.

I am using the following code from this link: RLlib Offline Datasets — Ray v1.8.0

from ray.rllib.offline.json_reader import JsonReader

json_path = "output-2021-11-09_18-36-32_worker-1_0.json"
reader = JsonReader(json_path)
for _ in range(2):
    batch = reader.next()
    print(batch)
    for episode in batch.split_by_episode():
        infos = batch.policy_batches['policy_1']['infos']

Maybe it was deprecated already.

Hi @deepgravity ,

from a look at the source code it is clear that a MultiAgentBatch does not have a split_by_episode() method (only regular SampleBatches do).

The batch in your code is apparently a MultiAgentBatch (check this with isinstance()) and therefore has no split_by_episode() method, so you get an exception. Without data to test on, I can only make a rough suggestion for how to code it:

infos = []
for policy_id, b in batch.policy_batches.items():
    infos.append(b["infos"])
    # OR, if you need the results by episode:
    # for eps_batch in b.split_by_episode():
    #     infos.append(eps_batch["infos"])

I do not know if you are running a multi-agent algorithm, or simply get a MultiAgentBatch back because you run on a real cluster. There is also a wrap_as_needed() method for MultiAgentBatches: with only a single (default) policy it returns a regular SampleBatch, on which you can apply split_by_episode() directly.
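
An untested sketch of that check-and-unwrap, assuming the file was produced as above:

from ray.rllib.offline.json_reader import JsonReader
from ray.rllib.policy.sample_batch import MultiAgentBatch, SampleBatch

reader = JsonReader("output-2021-11-09_18-36-32_worker-1_0.json")
batch = reader.next()

if isinstance(batch, MultiAgentBatch):
    # Collapses a MultiAgentBatch that holds only the default policy back
    # into a plain SampleBatch; with named policies it stays multi-agent.
    batch = MultiAgentBatch.wrap_as_needed(batch.policy_batches, batch.count)

if isinstance(batch, SampleBatch):
    for eps_batch in batch.split_by_episode():
        print(eps_batch["infos"])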

Hope this helps getting it to run

Hi @Lars_Simon_Zehnder
Thank you very much for your reply.
Yes, it was really helpful. Thanks :slight_smile:

For those who need it, this is how I wrote the code:

from ray.rllib.offline.json_reader import JsonReader

json_path = "output-2021-11-21_00-39-35_worker-1_0.json"
reader = JsonReader(json_path)

infos_list = []
dones_list = []
for batch in reader.read_all_files():
    # Each batch can hold several episodes; split it apart first.
    for eps_batch in batch.split_by_episode():
        infos_list.extend(eps_batch["infos"])
        dones_list.extend(eps_batch["dones"])