Metacognitive Prompting Improves Understanding in Large Language Models Paper • 2308.05342 • Published Aug 10, 2023 • 2
Large Language Models Struggle to Learn Long-Tail Knowledge Paper • 2211.08411 • Published Nov 15, 2022 • 3
Article Introducing Spaces Dev Mode for a seamless developer experience 13 days ago • 10
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy Paper • 2305.15294 • Published May 24, 2023 • 1
No Language Left Behind: Scaling Human-Centered Machine Translation Paper • 2207.04672 • Published Jul 11, 2022 • 1
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints May 1 • 53
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering Paper • 1809.09600 • Published Sep 25, 2018 • 2
What matters when building vision-language models? Paper • 2405.02246 • Published about 1 month ago • 87
Fast Inference from Transformers via Speculative Decoding Paper • 2211.17192 • Published Nov 30, 2022 • 3
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Paper • 2401.10774 • Published Jan 19 • 50
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding Paper • 2402.12374 • Published Feb 19 • 2
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 239
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published Apr 19 • 38
Training-Free Long-Context Scaling of Large Language Models Paper • 2402.17463 • Published Feb 27 • 18
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 73
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 563
Hydragen: High-Throughput LLM Inference with Shared Prefixes Paper • 2402.05099 • Published Feb 7 • 17
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 134
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale Paper • 2208.07339 • Published Aug 15, 2022 • 4
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers Paper • 2210.17323 • Published Oct 31, 2022 • 6
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 93
Article Assisted Generation: a new direction toward low-latency text generation May 11, 2023 • 9
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9 • 62
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 84
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4 • 22
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 102