@renos So if I understand correctly, you essentially want to add auxiliary losses to your policy loss so that your visual representation is trained jointly with the policy-specific parameters?
RLlib provides a custom_loss() hook on the model (ModelV2) that covers exactly this kind of use case. You can take a look at rllib/examples/custom_loss_and_metric.py to see how it's used in practice. The loss_inputs argument gives you access to the train_batch sampled from the replay buffer, so within that method you can compute a reward-prediction loss or any other auxiliary loss that would improve the representation.
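Roughly, it would look something like the sketch below (not the exact example file): a TorchModelV2 subclass that overrides custom_loss() to add a reward-prediction term. The class name, the linear auxiliary head, the flat observation assumption, and the 0.1 loss weight are all just placeholders for illustration.

```python
import torch.nn as nn

from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.models.torch.fcnet import FullyConnectedNetwork
from ray.rllib.policy.sample_batch import SampleBatch


class RewardPredictionModel(TorchModelV2, nn.Module):
    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)

        # Base network producing policy logits and value estimates.
        self.base = FullyConnectedNetwork(obs_space, action_space, num_outputs,
                                          model_config, name + "_base")
        # Hypothetical auxiliary head: predict the reward from the (flat) obs.
        self.reward_head = nn.Linear(obs_space.shape[0], 1)

    def forward(self, input_dict, state, seq_lens):
        logits, _ = self.base(input_dict, state, seq_lens)
        return logits, []

    def value_function(self):
        return self.base.value_function()

    def custom_loss(self, policy_loss, loss_inputs):
        # loss_inputs is the train batch; use it to build the auxiliary loss.
        obs = loss_inputs[SampleBatch.OBS].float()
        rewards = loss_inputs[SampleBatch.REWARDS].float()
        pred = self.reward_head(obs).squeeze(-1)
        self.aux_loss = nn.functional.mse_loss(pred, rewards)
        # For torch policies, policy_loss is a list of loss terms; add the
        # (arbitrarily weighted) auxiliary term to each of them.
        return [loss + 0.1 * self.aux_loss for loss in policy_loss]
```

You'd then register it with ModelCatalog.register_custom_model("reward_pred_model", RewardPredictionModel) and point your config's "model": {"custom_model": "reward_pred_model"} at it, so the auxiliary loss is optimized together with the regular policy loss.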