Collections
Collections including paper arxiv:2311.10768
- Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
  Paper • 2108.12409 • Published • 5
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 65
- MIMIC-IT: Multi-Modal In-Context Instruction Tuning
  Paper • 2306.05425 • Published • 11
- Music ControlNet: Multiple Time-varying Controls for Music Generation
  Paper • 2311.07069 • Published • 43

- Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
  Paper • 2310.18628 • Published • 7
- TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
  Paper • 2310.19019 • Published • 9
- Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
  Paper • 2311.02262 • Published • 10
- Thread of Thought Unraveling Chaotic Contexts
  Paper • 2311.08734 • Published • 6

- Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
  Paper • 2204.04581 • Published • 1
- Retrieval-Augmented Multimodal Language Modeling
  Paper • 2211.12561 • Published • 1
- When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
  Paper • 2212.10511 • Published • 1
- Memorizing Transformers
  Paper • 2203.08913 • Published • 2

- QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
  Paper • 2310.16795 • Published • 26
- Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
  Paper • 2308.12066 • Published • 4
- Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
  Paper • 2303.06182 • Published • 1
- EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate
  Paper • 2112.14397 • Published • 1

- Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model
  Paper • 2309.03550 • Published • 11
- Memory Augmented Language Models through Mixture of Word Experts
  Paper • 2311.10768 • Published • 16
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 183
- GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
  Paper • 2311.12631 • Published • 13