Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 144 • 17
Quantifying Generalization Complexity for Large Language Models Paper • 2410.01769 • Published Oct 2 • 13 • 2
Training Task Experts through Retrieval Based Distillation Paper • 2407.05463 • Published Jul 7 • 7 • 1
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Paper • 2406.12034 • Published Jun 17 • 14
Chain of Thought Prompt Tuning in Vision Language Models Paper • 2304.07919 • Published Apr 16, 2023
Self-Specialization: Uncovering Latent Expertise within Large Language Models Paper • 2310.00160 • Published Sep 29, 2023
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding Paper • 2403.00425 • Published Mar 1 • 1
Improving Neural Language Models by Segmenting, Attending, and Predicting the Future Paper • 1906.01702 • Published Jun 4, 2019
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Paper • 2309.03883 • Published Sep 7, 2023 • 34