Is there a way to add keys to a SampleBatch if rollout_fragment_length = 1?

esquires · September 11, 2022, 1:00am

How severely does this issue affect your experience of using Ray?

Medium: It makes my task significantly harder to complete, but I can work around it.

TL;DR: two questions:

1. Is there a technical reason that SAC has rollout_fragment_length set to 1? I’d like to make it bigger than the length of an episode for my environment, and I am wondering whether changing this default value will cause problems.
2. Is there a way to add data to the sample_batch used for a custom_loss without writing a postprocess_fn (which is limited by the rollout_fragment_length setting)? If possible, this would be preferable to changing rollout_fragment_length.

More detail:
I am looking at computing statistics over entire episodes to create a loss function for an RL system. In SAC, rollout_fragment_length is set to 1, but the parent classes use different values (SimpleQ has rollout_fragment_length = 4, and it looks like DQN inherits this value).

This matters because a policy’s postprocess_fn takes in a sample_batch whose length equals rollout_fragment_length. For instance, I can’t do what is described here in SAC without changing rollout_fragment_length.

Of course, changing rollout_fragment_length merely so I can add data to the dataset is a bit indirect, and it will affect other training factors that are generally more important (e.g. if the train batch is 256 and rollout_fragment_length is now set to 512, there is a lot of wasted computation). If there is an alternative way to apply a supervised loss to an entire trajectory without changing rollout_fragment_length, that would be better.
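To make the limitation concrete, here is a minimal sketch that does not use RLlib at all. A plain dict of lists stands in for a SampleBatch, and both the helper name add_episode_return and the "episode_return" key are hypothetical, chosen only for illustration. It shows why a statistic computed inside a postprocessing step only covers whatever fragment it is handed:

```python
from typing import Dict, List


def add_episode_return(batch: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Return a copy of `batch` with a hypothetical "episode_return" column.

    `batch` is a plain dict standing in for a SampleBatch (an assumption
    made for this sketch); the sum only covers the rows in this fragment.
    """
    rewards = batch["rewards"]
    total = sum(rewards)
    out = dict(batch)
    # Broadcast the fragment-level statistic to every row of the fragment.
    out["episode_return"] = [total] * len(rewards)
    return out


# A fragment that spans a whole (3-step) episode.
full = add_episode_return({"rewards": [1.0, 0.0, 2.0]})

# A fragment of length 1: the "statistic" is just that step's reward.
frag = add_episode_return({"rewards": [1.0]})
```

With a fragment of length 1, the added column carries no information beyond the single step, which is the problem described above.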

arturn · September 11, 2022, 4:13pm

Hi @esquires ,

You can change the rollout_fragment_length in SAC - no issue with that.
The amount of computation that you add by setting rollout_fragment_length=512 and train_batch_size=256 does not seem relevant to me. But rollout_fragment_length=256 would seem reasonable in that case.
Are you talking about adding offline data to your experience stream? If so, you can do something like "input": {"some_input_file": 0.5, "sampler": 0.5}. Have a look at our examples for more context.
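As a rough sketch, the settings mentioned in this reply could be collected in a plain config dict like the one below. The variable name sac_overrides is hypothetical, "some_input_file" is the placeholder from the reply rather than a real path, and how the dict is passed to the trainer depends on your RLlib version:

```python
# Hypothetical override dict; merge it into your own SAC config.
sac_overrides = {
    # Collect longer fragments so postprocessing sees more than one step.
    "rollout_fragment_length": 256,
    # Matching the train batch size, as suggested in the reply.
    "train_batch_size": 256,
    # Mix offline data with freshly sampled experience, 50/50.
    # "some_input_file" is a placeholder, not a real path.
    "input": {"some_input_file": 0.5, "sampler": 0.5},
}

# The mixing weights are probabilities, so they should sum to 1.
input_weight_total = sum(sac_overrides["input"].values())
```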
