Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training Paper • 2405.06932 • Published 12 days ago • 15
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Paper • 2405.08748 • Published 9 days ago • 17
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 24 days ago • 110
view article Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 25 days ago • 69
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD Paper • 2404.06512 • Published Apr 9 • 29
Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity Paper • 2403.14403 • Published Mar 21 • 6
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 101
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20 • 55
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5 • 92
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 182
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 49
Speculative Streaming: Fast LLM Inference without Auxiliary Models Paper • 2402.11131 • Published Feb 16 • 41
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Paper • 2402.04291 • Published Feb 6 • 48
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31 • 21
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5 • 61
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29 • 46
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model Paper • 2401.16420 • Published Jan 29 • 54
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 17
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Paper • 2401.14196 • Published Jan 25 • 44
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 253
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023 • 62
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 94
Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models Paper • 2308.00304 • Published Aug 1, 2023 • 22
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback Paper • 2307.15217 • Published Jul 27, 2023 • 34
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback Paper • 2307.14936 • Published Jul 27, 2023 • 41
On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models Paper • 2307.09793 • Published Jul 19, 2023 • 45
Challenges and Applications of Large Language Models Paper • 2307.10169 • Published Jul 19, 2023 • 46
Learning to Retrieve In-Context Examples for Large Language Models Paper • 2307.07164 • Published Jul 14, 2023 • 20
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 235