Is there a mix between truncate_episodes and complete_episodes?

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I was wondering whether it is possible to train my agent for, say, 10 episodes, but have training updates happen throughout each episode (as truncate_episodes allows) rather than only at the end of each episode (as complete_episodes does).

For example, I might want to train an agent for at most 10 episodes, where the number of steps per episode can vary, but have the network update every 512 steps taken, not at the end of each episode.
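To make the example concrete, here is a rough sketch of the kind of setup I mean. The "PPO" trainer name and the "CartPole-v1" environment are placeholders picked only for illustration, and setting both rollout_fragment_length and train_batch_size to 512 is my assumption about how to get one update per 512 steps with a single worker. The tune.run() call is left as a comment so the snippet runs without Ray installed.

```python
# Sketch only: the env and trainer names are illustrative placeholders.
config = {
    "env": "CartPole-v1",                # placeholder environment
    "batch_mode": "truncate_episodes",   # cut sample batches mid-episode
    "rollout_fragment_length": 512,      # steps collected per fragment (assumed)
    "train_batch_size": 512,             # steps per network update (assumed)
}

# Stop criterion: ask Tune to end the trial once 10 episodes have completed.
stop = {"episodes_total": 10}

# With Ray installed, this would be launched roughly as:
#   from ray import tune
#   tune.run("PPO", config=config, stop=stop)
```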

Right now I can approximate this by using truncate_episodes. However, if I specify exactly 10 episodes using stop: {episodes_total: 10} with tune.run(), training usually runs for something like 10 episodes plus a fraction of an additional episode.
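As far as I can tell, the overshoot happens because the stop criterion is only checked between training iterations, while each iteration collects a fixed number of steps regardless of episode boundaries. The toy simulation below illustrates that reasoning in plain Python. It does not call Ray, and run_until_stop and its parameters are hypothetical names made up for this sketch.

```python
import itertools

def run_until_stop(episode_lengths, steps_per_update=512, max_episodes=10):
    """Toy model: collect fixed-size fragments, check the stop rule after each.

    Returns (completed_episodes, steps_into_unfinished_episode, total_steps).
    """
    lengths = iter(episode_lengths)
    current_len = next(lengths)
    total_steps = 0
    completed = 0
    steps_into_current = 0
    while True:
        # One "training iteration": gather exactly steps_per_update env steps.
        for _ in range(steps_per_update):
            total_steps += 1
            steps_into_current += 1
            if steps_into_current == current_len:
                # Episode finished mid-fragment; the next one starts right away.
                completed += 1
                steps_into_current = 0
                current_len = next(lengths)
        # The stop rule is only evaluated here, between iterations.
        if completed >= max_episodes:
            return completed, steps_into_current, total_steps

# Every episode lasts 200 steps in this made-up example.
result = run_until_stop(itertools.cycle([200]))
print(result)  # (10, 48, 2048)
```

In this made-up case the tenth episode ends at step 2000, but the fragment runs on to step 2048, so the run stops 48 steps (24%) into an eleventh episode.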