meta-llama/Llama-3.3-70B-Instruct • Text Generation • Updated about 14 hours ago • 236k • 1.23k
The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink • Paper • 2204.05149 • Published Apr 11, 2022 • 7
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated 18 days ago • 2.81M • 1.14k
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models • Paper • 2410.13085 • Published Oct 16 • 20
Llama 2: Open Foundation and Fine-Tuned Chat Models • Paper • 2307.09288 • Published Jul 18, 2023 • 243
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • Paper • 2404.14219 • Published Apr 22 • 253
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention • Paper • 2404.07143 • Published Apr 10 • 104
Mamba: Linear-Time Sequence Modeling with Selective State Spaces • Paper • 2312.00752 • Published Dec 1, 2023 • 138
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding • Paper • 1810.04805 • Published Oct 11, 2018 • 16
RoBERTa: A Robustly Optimized BERT Pretraining Approach • Paper • 1907.11692 • Published Jul 26, 2019 • 7
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter • Paper • 1910.01108 • Published Oct 2, 2019 • 14
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model • Paper • 2211.05100 • Published Nov 9, 2022 • 27
LLaMA: Open and Efficient Foundation Language Models • Paper • 2302.13971 • Published Feb 27, 2023 • 13
Textbooks Are All You Need II: phi-1.5 technical report • Paper • 2309.05463 • Published Sep 11, 2023 • 87
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases • Paper • 2402.14905 • Published Feb 22 • 126
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework • Paper • 2404.14619 • Published Apr 22 • 126
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model • Paper • 2405.04434 • Published May 7 • 14
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence • Paper • 2406.11931 • Published Jun 17 • 57
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context • Paper • 2403.05530 • Published Mar 8 • 61
PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published 18 days ago • 118
Training Language Models to Self-Correct via Reinforcement Learning • Paper • 2409.12917 • Published Sep 19 • 135
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking • Paper • 2403.09629 • Published Mar 14 • 74
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution • Paper • 2409.12191 • Published Sep 18 • 74
Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities • Paper • 2308.12966 • Published Aug 24, 2023 • 7
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks • Paper • 2311.06242 • Published Nov 10, 2023 • 86
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models • Paper • 2409.17146 • Published Sep 25 • 104
Chameleon: Mixed-Modal Early-Fusion Foundation Models • Paper • 2405.09818 • Published May 16 • 126
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions • Paper • 2406.04325 • Published Jun 6 • 72
An Image is Worth 32 Tokens for Reconstruction and Generation • Paper • 2406.07550 • Published Jun 11 • 55
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels • Paper • 2406.09415 • Published Jun 13 • 50
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs • Paper • 2406.16860 • Published Jun 24 • 58
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning • Paper • 2406.19741 • Published Jun 28 • 59
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models • Paper • 2411.04905 • Published Nov 7 • 111
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models • Paper • 2402.17177 • Published Feb 27 • 88
Self-Discover: Large Language Models Self-Compose Reasoning Structures • Paper • 2402.03620 • Published Feb 6 • 112
Improved Baselines with Visual Instruction Tuning • Paper • 2310.03744 • Published Oct 5, 2023 • 37
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images • Paper • 2403.11703 • Published Mar 18 • 16
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages • Paper • 2308.12038 • Published Aug 23, 2023 • 2