Papers - Training Research - Clamping
Modifying activations during training with proper gradient flow
Collection by matlok, May 5
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
Papers - Training Research - Loss Dynamics - Clamping
Collection by matlok, May 5
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
Papers - Attention - Previous Token Head
Collection by matlok, May 5
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
Papers - Training - Ablation
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
Papers - Attention - Ablation
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
Papers - Attention - Induction Heads
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
- Are Sixteen Heads Really Better than One? Paper • 1905.10650 • Published May 25, 2019 • 2
Papers - XAI - Attention - Induction Heads
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
- Are Sixteen Heads Really Better than One? Paper • 1905.10650 • Published May 25, 2019 • 2
Papers - Ablation - Attention - Head Pruning
Causal ablations taking into account LayerNorm
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
- Are Sixteen Heads Really Better than One? Paper • 1905.10650 • Published May 25, 2019 • 2
Papers - Custom Layers - Residual Connection - Ablation
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
Papers - Custom Layers - No Dropout - Dropout Regularization
Collection by matlok, May 4
- What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3