HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering Paper • 1809.09600 • Published Sep 25, 2018 • 2
Fast Inference from Transformers via Speculative Decoding Paper • 2211.17192 • Published Nov 30, 2022 • 3
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Paper • 2401.10774 • Published Jan 19 • 50
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding Paper • 2402.12374 • Published Feb 19 • 2
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 18 days ago • 228
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published 21 days ago • 38
Training-Free Long-Context Scaling of Large Language Models Paper • 2402.17463 • Published Feb 27 • 17
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent 18 days ago • 67
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 21 days ago • 491
Hydragen: High-Throughput LLM Inference with Shared Prefixes Paper • 2402.05099 • Published Feb 7 • 16
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community 25 days ago • 114
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale Paper • 2208.07339 • Published Aug 15, 2022 • 4
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers Paper • 2210.17323 • Published Oct 31, 2022 • 5
Article It's raining diffusion personalization techniques☔️🎭🖼️ By linoyts • 29 days ago • 15
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published 29 days ago • 92
Article Assisted Generation: a new direction toward low-latency text generation May 11, 2023 • 6
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9 • 61
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 84
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4 • 22
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 99
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action Paper • 2303.11381 • Published Mar 20, 2023 • 2
Gorilla: Large Language Model Connected with Massive APIs Paper • 2305.15334 • Published May 24, 2023 • 4
Reflexion: Language Agents with Verbal Reinforcement Learning Paper • 2303.11366 • Published Mar 20, 2023 • 3
Linearity of Relation Decoding in Transformer Language Models Paper • 2308.09124 • Published Aug 17, 2023 • 2
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints Paper • 2305.13245 • Published May 22, 2023 • 5
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Paper • 2403.13248 • Published Mar 20 • 71
Clembench: Using Game Play to Evaluate Chat-Optimized Language Models as Conversational Agents Paper • 2305.13455 • Published May 22, 2023 • 2
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models Paper • 2109.05093 • Published Sep 10, 2021 • 1
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning Paper • 2305.13971 • Published May 23, 2023 • 3
Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment Paper • 2204.03025 • Published Apr 6, 2022 • 2
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment Paper • 2403.05135 • Published Mar 8 • 39
LLM as a Judge Collection Curated resources that support the use of LLMs to serve as automatic evaluators of other LLM outputs. • 14 items • Updated Feb 19 • 14