Hi, I would like to train an actor for a custom environment using a transformer-like policy network. I came across the GTrXL net and would like to know whether it supports multimodal input. By this I mean a tokenizer for visual features (images) and perceptual features (joint states, etc.).
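To make the question concrete, here is a minimal sketch of the kind of multimodal tokenization meant above. It is plain Python and does not use any RLlib API. The observation keys `image` and `joint_states`, the function names, and the patch size are all hypothetical. In a real model each token would presumably pass through a learned projection before reaching the attention layers; the sketch only shows how a dict observation could be shaped into one token sequence.

```python
from typing import Dict, List, Sequence

Token = List[float]


def image_to_patch_tokens(image: Sequence[Sequence[float]], patch: int) -> List[Token]:
    """Split a 2D grayscale image into non-overlapping patch x patch tiles,
    flattening each tile into one token."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                float(image[r][c])
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tokens


def state_to_token(state: Sequence[float], dim: int) -> Token:
    """Zero-pad a low-dimensional state vector to the common token width."""
    if len(state) > dim:
        raise ValueError("state vector is longer than the token width")
    return [float(x) for x in state] + [0.0] * (dim - len(state))


def tokenize_observation(obs: Dict[str, object], patch: int = 2) -> List[Token]:
    """Turn a dict observation into one sequence of equal-width tokens:
    all image patches first, then a single joint-state token."""
    image_tokens = image_to_patch_tokens(obs["image"], patch)
    dim = patch * patch  # every token has this width
    return image_tokens + [state_to_token(obs["joint_states"], dim)]


# Hypothetical observation: a 4x4 image and three joint values.
obs = {
    "image": [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]],
    "joint_states": [0.1, -0.2, 0.3],
}
tokens = tokenize_observation(obs)
print(len(tokens), len(tokens[0]))
```

Whether GTrXL can consume such a sequence directly, or needs the dict observation flattened beforehand, is exactly the open question.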