- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 22
- Neurons in Large Language Models: Dead, N-gram, Positional
  Paper • 2309.04827 • Published • 16
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Paper • 2309.05516 • Published • 9
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
  Paper • 2309.03907 • Published • 10
Collections including paper arxiv:2406.10209
- Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
  Paper • 2406.10209 • Published • 8
- tomg-group-umd/3-goldfish-loss-llama-1B
  Text2Text Generation • Updated • 18
- tomg-group-umd/4-goldfish-loss-llama-1B
  Text2Text Generation • Updated • 21
- tomg-group-umd/8-goldfish-loss-llama-1B
  Text2Text Generation • Updated • 16

- DataComp-LM: In search of the next generation of training sets for language models
  Paper • 2406.11794 • Published • 50
- Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
  Paper • 2406.10209 • Published • 8
- Transformers Can Do Arithmetic with the Right Embeddings
  Paper • 2405.17399 • Published • 52
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
  Paper • 2406.11931 • Published • 57