Collections
Collections including paper arxiv:2312.11514
-
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Paper • 2404.11912 • Published • 15
SnapKV: LLM Knows What You are Looking for Before Generation
Paper • 2404.14469 • Published • 23
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 252
-
Towards a World-English Language Model for On-Device Virtual Assistants
Paper • 2403.18783 • Published • 4
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper • 2403.09611 • Published • 119
ReALM: Reference Resolution As Language Modeling
Paper • 2403.20329 • Published • 20
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper • 2404.05719 • Published • 56
-
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 80
Sensor-based Multi-Robot Search and Coverage with Spatial Separation in Unstructured Environments
Paper • 2403.01710 • Published • 2
EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models
Paper • 2308.14352 • Published
Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems
Paper • 2306.12691 • Published • 2
-
mistralai/Mixtral-8x7B-Instruct-v0.1
Text Generation • Updated • 561k • 3.81k
HuggingFaceM4/WebSight
Viewer • Updated • 186 • 282
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 252
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 235