How severely does this issue affect your experience of using Ray?
- Medium: It contributes significant difficulty to completing my task, but I can work around it.
Hi,
I have recently picked up RLlib, though I have a background in RL (specifically bandits). I'm trying to find an elegant way to save the trajectories (or, more specifically, the observations) seen by my agent in memory, not on disk.
I've read this post: How to save PPO trajectory and train at a later time, which comes pretty close to what I'm after; however, I'm wondering whether there is a way to keep these trajectories in memory and avoid extensive disk I/O. Callbacks seem to be the way to do this, but as far as I can tell, the only way to use the callback dictionaries as "intended" would be to stash the data in the custom_metrics dictionary and return the raw values via the "keep_per_episode_custom_metrics" config key (which would be a duct-tape patch even if it worked, imo).
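For concreteness, here is a minimal sketch of the callback approach I have in mind, assuming the Ray 2.x import path (older versions expose DefaultCallbacks under ray.rllib.agents.callbacks) and assuming I'm reading the Episode API (user_data, last_obs_for()) correctly:

```python
from ray.rllib.algorithms.callbacks import DefaultCallbacks


class TrajectoryBuffer(DefaultCallbacks):
    """Accumulates per-episode observations in the rollout worker's memory."""

    def __init__(self):
        super().__init__()
        # Lives in the worker process; never touches disk.
        self.trajectories = []

    def on_episode_start(self, *, worker, base_env, policies, episode, **kwargs):
        # Scratch space for the episode that just started.
        episode.user_data["obs"] = []

    def on_episode_step(self, *, worker, base_env, policies=None, episode, **kwargs):
        # last_obs_for() returns the most recent (preprocessed) observation.
        episode.user_data["obs"].append(episode.last_obs_for())

    def on_episode_end(self, *, worker, base_env, policies, episode, **kwargs):
        # Move the finished trajectory into the worker-local buffer.
        self.trajectories.append(episode.user_data["obs"])
```

I would register this with config["callbacks"] = TrajectoryBuffer, but since each rollout worker is its own process, the buffers end up scattered across workers, which brings me back to the access problem below.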
Alternatively, I could save the states in the environment itself, but after reading through the documentation for two days now, I can't find a way to access a RolloutWorker's attributes or its environment object from the driver.
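The closest thing I've found is WorkerSet.foreach_worker together with RolloutWorker.foreach_env. Is something like the sketch below the intended way to reach into the workers? (Here algo is a built Algorithm, e.g. PPO; in older versions this would be trainer.workers. saved_obs is a hypothetical attribute on my own env class, not a real RLlib field.)

```python
# Pull the buffers collected by the callback above out of each rollout
# worker; the results are serialized back to the driver process.
trajectories = algo.workers.foreach_worker(
    lambda worker: worker.callbacks.trajectories
)

# Or, if the env itself stores the data in a (hypothetical) `saved_obs`
# attribute, apply a function to every sub-environment instead:
observations = algo.workers.foreach_worker(
    lambda worker: worker.foreach_env(lambda env: env.saved_obs)
)
```

My understanding is that remote workers are separate Ray actors, so I can't just grab a direct reference to the env from the driver. If the above is the intended pattern, is there a more idiomatic way that I've missed?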