Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency • Paper • 2310.03734 • Published Oct 5, 2023 • 15
Small-scale proxies for large-scale Transformer training instabilities • Paper • 2309.14322 • Published Sep 25, 2023 • 20
Neurons in Large Language Models: Dead, N-gram, Positional • Paper • 2309.04827 • Published Sep 9, 2023 • 17