Collections
Collections including paper arxiv:2403.10131
- Mistral 7B
  Paper • 2310.06825 • Published • 45
- Instruction Tuning with Human Curriculum
  Paper • 2310.09518 • Published • 3
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 65
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 24
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 100
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 32
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 49
- Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
  Paper • 2403.18814 • Published • 42
- Uni-SMART: Universal Science Multimodal Analysis and Research Transformer
  Paper • 2403.10301 • Published • 50
- Recurrent Drafter for Fast Speculative Decoding in Large Language Models
  Paper • 2403.09919 • Published • 20
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 65
- Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
  Paper • 2403.09704 • Published • 30
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  Paper • 1701.06538 • Published • 4
- Attention Is All You Need
  Paper • 1706.03762 • Published • 40
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 11
- Language Model Evaluation Beyond Perplexity
  Paper • 2106.00085 • Published