Case2Code: Learning Inductive Reasoning with Synthetic Data Paper • 2407.12504 • Published 5 days ago • 6 • 5
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated Paper • 2407.10969 • Published 7 days ago • 16 • 3
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published 11 days ago • 28 • 3
Autoregressive Speech Synthesis without Vector Quantization Paper • 2407.08551 • Published 11 days ago • 12 • 4
Inference Performance Optimization for Large Language Models on CPUs Paper • 2407.07304 • Published 12 days ago • 47 • 7
ProgressGym: Alignment with a Millennium of Moral Progress Paper • 2406.20087 • Published 24 days ago • 3 • 2
Wavelets Are All You Need for Autoregressive Image Generation Paper • 2406.19997 • Published 24 days ago • 27 • 5
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs Paper • 2407.00653 • Published 22 days ago • 11 • 2
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS Paper • 2406.18009 • Published 26 days ago • 18 • 3
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published 21 days ago • 30 • 5
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation Paper • 2407.00468 • Published 23 days ago • 35 • 2
Simulating Classroom Education with LLM-Empowered Agents Paper • 2406.19226 • Published 25 days ago • 28 • 8
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression Paper • 2406.14909 • Published Jun 21 • 12 • 4
Long Code Arena: a Set of Benchmarks for Long-Context Code Models Paper • 2406.11612 • Published Jun 17 • 20 • 3
TroL: Traversal of Layers for Large Language and Vision Models Paper • 2406.12246 • Published Jun 18 • 34 • 2
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17 • 54 • 3
Designing a Dashboard for Transparency and Control of Conversational AI Paper • 2406.07882 • Published Jun 12 • 9 • 4
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 116 • 9
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 124 • 14
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 38 • 9
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies Paper • 2404.06395 • Published Apr 9 • 18 • 1
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4 • 23 • 3
PointInfinity: Resolution-Invariant Point Diffusion Models Paper • 2404.03566 • Published Apr 4 • 13 • 1
Learning to Decode Collaboratively with Multiple Language Models Paper • 2403.03870 • Published Mar 6 • 17 • 6
Orca-Math: Unlocking the potential of SLMs in Grade School Math Paper • 2402.14830 • Published Feb 16 • 24 • 3
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Paper • 2402.10176 • Published Feb 15 • 33 • 4
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16 • 40 • 8
Premise Order Matters in Reasoning with Large Language Models Paper • 2402.08939 • Published Feb 14 • 24 • 3
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data Paper • 2402.08093 • Published Feb 12 • 53 • 9
Direct Language Model Alignment from Online AI Feedback Paper • 2402.04792 • Published Feb 7 • 25 • 3
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Paper • 2402.04291 • Published Feb 6 • 48 • 3
ReGAL: Refactoring Programs to Discover Generalizable Abstractions Paper • 2401.16467 • Published Jan 29 • 7 • 2
Proactive Detection of Voice Cloning with Localized Watermarking Paper • 2401.17264 • Published Jan 30 • 15 • 4
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance Paper • 2401.15687 • Published Jan 28 • 20 • 4
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling Paper • 2401.15977 • Published Jan 29 • 35 • 8
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29 • 46 • 7
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 17 • 7
Orion-14B: Open-source Multilingual Large Language Models Paper • 2401.12246 • Published Jan 20 • 10 • 2
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16 • 35 • 6