Hallucination Detox: Sensitive Neuron Dropout (SeND) for Large Language Model Training Paper • 2410.15460 • Published Oct 20
Scavenging Hyena: Distilling Transformers into Long Convolution Models Paper • 2401.17574 • Published Jan 31