Hey everyone,
I’m trying to save the output episodes of my policy evaluation to files (e.g. JSON/JSONL) so I can look at the evolution after training. Is there any way to do this? Due to storage constraints I only want the evaluation episodes, not the training episodes; otherwise I would guess AlgorithmConfig.offline_data would do the job.
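For reference, this is roughly the offline_data setup I have in mind; as far as I understand it would write all sampled batches (training included) as JSON files, which is exactly what I want to avoid. The algorithm and output path are just examples:

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # example env
    # Writes every sampled batch (training and evaluation) to JSON files.
    .offline_data(output="/tmp/rllib-episodes")
)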
A few years ago it was possible to achieve this by nesting an output dict into the evaluation config dict, but I am unsure how to do it now, since the way the framework is configured has changed drastically.
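As far as I remember, the old dict-based approach looked roughly like this (keys and path are just illustrative); nesting "output" under "evaluation_config" made only the evaluation workers write their samples:

# Old-style config dict (pre-AlgorithmConfig), from memory
config = {
    "evaluation_interval": 1,
    "evaluation_config": {
        "output": "/tmp/eval-episodes",  # example path
    },
}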
Thanks a lot in advance.
Best regards,
André
For people who stumble on this in the future:
I was unable to find any setting in the offline_data configuration that logs only the evaluation episodes without also writing the training episodes, and there seems to be an inherent problem with the TensorFlow implementation and offline-data logging during evaluation.
I updated my RLlib/Ray install to ray==2.24.0 (good job on the new RLlib interfaces, btw) and implemented a custom evaluation function:
# Source: https://discuss.ray.io/t/saving-evaluation-episodes-to-files/14780
from ray.rllib.offline.io_context import IOContext
from ray.rllib.offline.json_writer import JsonWriter
from ray.rllib.utils.metrics import ENV_RUNNER_RESULTS, EVALUATION_RESULTS


def custom_eval_function(algorithm, eval_workers):
    # Each remote eval worker samples *one* episode.
    episodes = eval_workers.foreach_worker(
        func=lambda w: w.sample(), local_worker=False
    )
    metrics = eval_workers.foreach_worker(
        func=lambda w: w.get_metrics(), local_worker=False
    )

    # Merge the per-worker metrics and reduce them for logging.
    algorithm.metrics.merge_and_log_n_dicts(
        metrics, key=(EVALUATION_RESULTS, ENV_RUNNER_RESULTS)
    )
    eval_results = algorithm.metrics.reduce(
        key=(EVALUATION_RESULTS, ENV_RUNNER_RESULTS)
    )

    # Iterate over the per-worker samples, keeping the worker index.
    for wi, w in enumerate(episodes):
        ioctx = IOContext(
            log_dir=algorithm._logdir,
            config=None,
            worker_index=wi,
        )
        json_writer = JsonWriter(
            path=algorithm._logdir,
            ioctx=ioctx,
            compress_columns=["obs", "new_obs", "infos"],
        )
        # Use the current training iteration as the file index so each
        # evaluation round ends up in its own output file.
        json_writer.file_index = algorithm.training_iteration
        # Write all episodes this worker gathered (just one for now).
        for e in w:
            json_writer.write(e.get_sample_batch())

    return eval_results
This can be used in the configuration as follows:
# [...]
.evaluation(
    # [...]
    custom_evaluation_function=custom_eval_function,
    # [...]
)
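For completeness, here is a minimal end-to-end sketch of how I wire this up; the algorithm (PPO), the environment (CartPole-v1) and the iteration count are just placeholders for my actual setup, and I assume the same new-stack env-runner setup that the function above relies on:

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # placeholder env
    .evaluation(
        evaluation_interval=1,  # run (and write) an evaluation every iteration
        custom_evaluation_function=custom_eval_function,
    )
)
algo = config.build()
for _ in range(10):
    algo.train()  # evaluation episodes end up as JSON files in algo._logdir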
This way, I can now also log the evaluation episodes for my TensorFlow models. Maybe this helps someone.
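To sanity-check the written files, RLlib's JsonReader can read them back; the glob pattern below is just my assumption about the default output file names:

from ray.rllib.offline.json_reader import JsonReader

# Each call to next() returns one SampleBatch read from the output files.
reader = JsonReader("/path/to/logdir/output-*.json")
batch = reader.next()
print(batch.count)  # number of timesteps in this batch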
For me, a single episode per worker is enough, but if you extend it, feel free to post it here so others can use it too!
Cheers,
André