Hi Guys,
My experiment.json files are getting huge pretty fast. Running RLlib at scale (via Tune) on a big cluster fills up the shared drive in no time. I might be missing something: why is experiment.json even being logged? Is it valuable for post-training analysis? Is there any automation or tool to analyze it?
Either way, I’d suggest giving users an option to turn it off to save storage space.
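
In case it helps anyone check their own setup, here is a minimal sketch for totaling the space these JSON files take up. The `json_usage` helper and the `*.json` glob pattern are assumptions for illustration only (they are not part of Tune or RLlib), so adjust the pattern to whatever your results directory actually contains.

```python
from pathlib import Path


def json_usage(root, pattern="*.json"):
    """Return (total_bytes, files) for files matching `pattern` under `root`.

    `files` is a list of (size_in_bytes, path) tuples, largest first.
    Hypothetical helper: the default pattern is a guess, not a Tune convention.
    """
    # Walk the tree recursively and record the size of every matching file.
    files = [(p.stat().st_size, p) for p in Path(root).rglob(pattern) if p.is_file()]
    # Sort by size only, so the biggest offenders come first.
    files.sort(key=lambda item: item[0], reverse=True)
    total = sum(size for size, _ in files)
    return total, files
```

Calling `total, files = json_usage("/path/to/results")` (with your own results path) and printing the first few entries of `files` should show which experiments account for most of the space.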