Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published 8 days ago • 119
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 11 items • Updated 14 days ago • 343
Phantom of Latent for Large Language and Vision Models Paper • 2409.14713 • Published 17 days ago • 27
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published 21 days ago • 35
Qwen2.5 Collection Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). • 45 items • Updated 21 days ago • 235
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 22 days ago • 203
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale Paper • 2409.08264 • Published 27 days ago • 42
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos Paper • 2409.07450 • Published 28 days ago • 10
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis Paper • 2409.06135 • Published about 1 month ago • 14
SongCreator: Lyrics-based Universal Song Generation Paper • 2409.06029 • Published about 1 month ago • 19
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery Paper • 2409.05591 • Published about 1 month ago • 26
Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task Paper • 2409.04005 • Published Sep 6 • 16
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 113
Real-Time Video Generation with Pyramid Attention Broadcast Paper • 2408.12588 • Published Aug 22 • 13
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22 • 77
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19 • 72
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3 • 92
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated 22 days ago • 41
Vision Language Leaderboards Collection This collection has all the vision language leaderboards. • 7 items • Updated Aug 24 • 9
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning Paper • 2407.20798 • Published Jul 30 • 23
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23 • 67
MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning Paper • 2407.16312 • Published Jul 23 • 12
OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person Paper • 2407.16224 • Published Jul 23 • 23
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence Paper • 2407.16655 • Published Jul 23 • 27
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation Paper • 2407.15060 • Published Jul 21 • 9
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Paper • 2407.15841 • Published Jul 22 • 39
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference Paper • 2407.14057 • Published Jul 19 • 43
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Paper • 2407.14482 • Published Jul 19 • 24
Article ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models By yuchenlin • Jul 27 • 22
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 9 items • Updated 1 day ago • 40
MUSCLE: A Model Update Strategy for Compatible LLM Evolution Paper • 2407.09435 • Published Jul 12 • 20
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12 • 125
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11 • 31
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
👿 Daredevil-8B Collection Fine-tuned abliterated merge of the best Llama 3 8B model. Highest MMLU score in its category. • 5 items • Updated Aug 16 • 8
LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 264 items • Updated Jun 22 • 401
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 85
Quantized Models (GGUF, IQ, Imatrix) Collection Various quantizations of models in the GGUF format. Models with a "checkmark" are personal favorites. An "orange arrow" means it's being uploaded. • 87 items • Updated 7 days ago • 46
Personal Favorites Collection Recommended models I use often or like for any reason. I recommend reading their cards for more details. • 9 items • Updated Aug 13 • 52
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2 • 30
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models, most important ones are on top of the list. • 27 items • Updated Apr 30 • 32
Training-Free Long-Context Scaling of Large Language Models Paper • 2402.17463 • Published Feb 27 • 19