LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes Paper • 2311.13384 • Published Nov 22, 2023 • 50
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis Paper • 2311.12454 • Published Nov 21, 2023 • 30
Memory Augmented Language Models through Mixture of Word Experts Paper • 2311.10768 • Published Nov 15, 2023 • 16
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models Paper • 2312.00079 • Published Nov 30, 2023 • 14
PromptBench: A Unified Library for Evaluation of Large Language Models Paper • 2312.07910 • Published Dec 13, 2023 • 15
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 25
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model Paper • 2312.11370 • Published Dec 18, 2023 • 20
Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model Paper • 2312.12423 • Published Dec 19, 2023 • 12
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 56
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation Paper • 2401.04092 • Published Jan 8 • 21
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon Paper • 2401.03462 • Published Jan 7 • 27
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation Paper • 2401.15688 • Published Jan 28 • 11
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Paper • 2402.04291 • Published Feb 6 • 48
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Paper • 2402.04858 • Published Feb 7 • 14
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29 • 49
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning Paper • 2402.06332 • Published Feb 9 • 18
Linear Transformers with Learnable Kernel Functions are Better In-Context Models Paper • 2402.10644 • Published Feb 16 • 79
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 112
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling Paper • 2402.12226 • Published Feb 19 • 41
The FinBen: An Holistic Financial Benchmark for Large Language Models Paper • 2402.12659 • Published Feb 20 • 17
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT Paper • 2402.16840 • Published Feb 26 • 23
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs Paper • 2402.15627 • Published Feb 23 • 34
Training-Free Long-Context Scaling of Large Language Models Paper • 2402.17463 • Published Feb 27 • 19
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward Paper • 2404.01258 • Published Apr 1 • 10
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs Paper • 2403.20041 • Published Mar 29 • 34
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data Paper • 2404.12195 • Published Apr 18 • 11
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 126
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 75
Layer-Condensed KV Cache for Efficient Inference of Large Language Models Paper • 2405.10637 • Published May 17 • 19