A Primer on the Inner Workings of Transformer-based Language Models • Paper • 2405.00208 • Published 21 days ago • 6
Neurons in Large Language Models: Dead, N-gram, Positional • Paper • 2309.04827 • Published Sep 9, 2023 • 16