Saving evaluation episodes to files

Hey everyone,
I’m trying to save the output episodes of my policy evaluation to files (e.g. JSON/JSONL), so I can look at how the policy evolves over the course of training. Is there any way to do this? Due to storage constraints I only want the evaluation episodes, not the training episodes; otherwise I would guess AlgorithmConfig.offline_data would do the job.
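
For reference, this is roughly the offline_data route I mean (a sketch; as far as I can tell it writes *all* sampled episodes, including the training ones):

# Sketch of the offline_data route; this writes training episodes too,
# which is exactly what my storage constraints rule out.
from ray.rllib.algorithms.algorithm_config import AlgorithmConfig

config = (
    AlgorithmConfig()
    .offline_data(
        output="/tmp/my-episodes",                   # output directory for JSON files
        output_compress_columns=["obs", "new_obs"],  # compress the large columns
    )
)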

A few years ago it was possible to achieve this by nesting an output dict inside the evaluation config dict, but I am unsure how to do it now, since the way the framework is configured has changed drastically.
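
If it helps, the old dict-style config I remember looked roughly like this (from memory, so the exact keys may be off):

# Legacy dict-style config (roughly; pre-AlgorithmConfig days):
config = {
    # ... other algorithm settings ...
    "evaluation_interval": 1,
    "evaluation_config": {
        # The nested output setting applied only to the eval workers,
        # so only evaluation rollouts were written.
        "output": "/tmp/eval-episodes",
    },
}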

Thanks a lot in advance.

Best regards,
André

For people who stumble on this in the future:

I was unable to find any offline_data configuration that logs only the evaluation episodes without also writing the training episodes. There also seems to be an inherent problem between the TensorFlow implementation and offline data logging during evaluation.

I updated my RLlib/Ray install to ray==2.24.0 (good job on the new RLlib interfaces, btw) and implemented a custom evaluation function:

# Source: https://discuss.ray.io/t/saving-evaluation-episodes-to-files/14780
from ray.rllib.offline import IOContext, JsonWriter
from ray.rllib.utils.metrics import ENV_RUNNER_RESULTS, EVALUATION_RESULTS


def custom_eval_function(algorithm, eval_workers):
    # Each remote eval worker samples *one* episode and reports its metrics
    episodes = eval_workers.foreach_worker(
        func=lambda w: w.sample(), local_worker=False
    )
    metrics = eval_workers.foreach_worker(
        func=lambda w: w.get_metrics(), local_worker=False
    )

    # Calculate the metrics for logging
    algorithm.metrics.merge_and_log_n_dicts(
        metrics, key=(EVALUATION_RESULTS, ENV_RUNNER_RESULTS)
    )
    eval_results = algorithm.metrics.reduce(
        key=(EVALUATION_RESULTS, ENV_RUNNER_RESULTS)
    )

    # Iterate over each worker's list of sampled episodes, keeping a worker index
    for wi, w in enumerate(episodes):
        # Give every worker its own IOContext, so outputs from different
        # workers can be told apart and don't collide
        ioctx = IOContext(
            log_dir=algorithm._logdir,
            config=None,
            worker_index=wi,
        )
        json_writer = JsonWriter(
            path=algorithm._logdir,
            ioctx=ioctx,
            compress_columns=["obs", "new_obs", "infos"],
        )
        
        # Use the current training iteration as the file index, so each
        # evaluation round ends up in its own file
        json_writer.file_index = algorithm.training_iteration

        # Write all episodes that this worker gathered (should just be one for now)
        for e in w:
            json_writer.write(e.get_sample_batch())
            
    return eval_results

This can be used in the configuration as follows:

# [...]
.evaluation(
    # [...]
    custom_evaluation_function=custom_eval_function,
    # [...]
)
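
For completeness, here is roughly how the full setup looks (a sketch with placeholder algo/env; on older Ray versions the worker-count argument is called evaluation_num_workers instead):

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")        # placeholder env
    .evaluation(
        evaluation_interval=1,         # run an evaluation every training iteration
        evaluation_num_env_runners=2,  # one JSON file per eval worker per iteration
        custom_evaluation_function=custom_eval_function,
    )
)

algo = config.build()
for _ in range(5):
    algo.train()  # evaluation (and episode writing) happens inside train()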

This way, I can now log the evaluation episodes for my TensorFlow models as well. Maybe this helps someone.
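
To actually look at the evolution afterwards, the written batches can be read back with RLlib's JsonReader (a sketch; the exact file name pattern in your logdir may differ):

import glob

from ray.rllib.offline import JsonReader

# The output file names below are an assumption; check the algorithm's
# logdir for the actual names JsonWriter produced.
for path in sorted(glob.glob("/path/to/logdir/output-*.json")):
    reader = JsonReader(path)
    batch = reader.next()  # returns one logged SampleBatch per call
    print(path, "->", batch.count, "timesteps")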

For me, a single episode per worker is enough, but if you extend it, feel free to post your version here so others can use it too!
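
As a possible starting point: with the new-API-stack env runners, sample() takes a num_episodes argument (that signature is an assumption on my side, so double-check it for your setup), and the sampling line would become:

# Sample several episodes per eval worker instead of one
# (assumes EnvRunner.sample() accepts num_episodes, as on the new API stack):
episodes = eval_workers.foreach_worker(
    func=lambda w: w.sample(num_episodes=5), local_worker=False
)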

Cheers,
André