Associative Recurrent Memory Transformer
Paper • 2407.04841 • Published • 31
Note A neural architecture for very long sequences that requires constant time to process new information at each time step. It combines transformer self-attention for local context with segment-level recurrence for storing task-specific information distributed over a long context. [2R; for continual learning?]
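A minimal sketch of the segment-level recurrence pattern the note describes (my own illustration of the generic recurrent-memory idea, not the ARMT code; ARMT additionally adds an associative memory update on top of this). Memory tokens are prepended to each segment, self-attention handles the local context, and the updated memory states are carried to the next segment, so per-step cost stays constant regardless of total length. All class and variable names here are assumptions.

```python
# Illustrative sketch of segment-level recurrence with memory tokens (RMT-style).
import torch
import torch.nn as nn

class RecurrentMemorySketch(nn.Module):
    def __init__(self, d_model=64, n_mem=4, n_head=4):
        super().__init__()
        # Learnable initial memory tokens carried across segments.
        self.mem = nn.Parameter(torch.randn(1, n_mem, d_model) * 0.02)
        self.block = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
        self.n_mem = n_mem

    def forward(self, x, segment_len=32):
        # x: (batch, seq_len, d_model), processed segment by segment.
        mem = self.mem.expand(x.size(0), -1, -1)
        outs = []
        for seg in x.split(segment_len, dim=1):
            h = self.block(torch.cat([mem, seg], dim=1))      # self-attention over [memory; segment]
            mem, out = h[:, :self.n_mem], h[:, self.n_mem:]   # updated memory is the recurrent state
            outs.append(out)
        return torch.cat(outs, dim=1), mem

model = RecurrentMemorySketch()
y, final_mem = model(torch.randn(2, 128, 64))
```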
Note We propose a new class of sequence modeling layers with linear complexity and an expressive hidden state. The key idea is to make the hidden state a machine learning model itself, and the update rule a step of self-supervised learning. Since the hidden state is updated by training even on test sequences, our layers are called Test-Time Training (TTT) layers. [2R; for continual learning? related to RNNs]
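A minimal sketch of that idea (my own illustration, not the authors' implementation): the hidden state is the weight matrix of a tiny linear model, and processing each token performs one gradient step on a self-supervised reconstruction loss before reading the token out through the updated model. The projection names `theta_K`/`theta_V`/`theta_Q` and the class name are assumptions for illustration.

```python
# Sketch of a TTT-style layer: hidden state = weights of a small linear model,
# updated by one gradient step of a self-supervised loss per incoming token.
import numpy as np

class TTTLinearSketch:
    def __init__(self, dim, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = np.zeros((dim, dim))  # hidden state: a linear model
        # Projections defining the self-supervised task (assumed setup).
        self.theta_K = rng.normal(scale=dim**-0.5, size=(dim, dim))  # training view
        self.theta_V = rng.normal(scale=dim**-0.5, size=(dim, dim))  # label view
        self.theta_Q = rng.normal(scale=dim**-0.5, size=(dim, dim))  # readout view
        self.lr = lr

    def step(self, x):
        k, v, q = self.theta_K @ x, self.theta_V @ x, self.theta_Q @ x
        err = self.W @ k - v
        # One gradient step on 0.5 * ||W k - v||^2 with respect to W.
        self.W -= self.lr * np.outer(err, k)
        return self.W @ q  # output through the just-updated hidden state

layer = TTTLinearSketch(dim=8)
outputs = [layer.step(x) for x in np.random.default_rng(1).normal(size=(16, 8))]
```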