MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published 1 day ago • 30
SUTRA: Scalable Multilingual Language Model Architecture Paper • 2405.06694 • Published 14 days ago • 34
You Only Cache Once: Decoder-Decoder Architectures for Language Models Paper • 2405.05254 • Published 13 days ago • 5
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • 15 days ago • 24
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 23 days ago • 109
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published 19 days ago • 21
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 19 days ago • 94
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Paper • 2311.05908 • Published Nov 10, 2023 • 11
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published 29 days ago • 120
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study Paper • 2404.14047 • Published 30 days ago • 37
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 37
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 29 days ago • 230
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated 15 days ago • 77
Computer Vision Backbones 🧩 Collection A collection of useful computer vision backbones to fine-tune. It also includes large image classification models that can be used as backbones. • 22 items • Updated Sep 19, 2023 • 12
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 38
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models; the most important ones are at the top of the list. • 27 items • Updated 21 days ago • 26
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints Paper • 2305.13245 • Published May 22, 2023 • 5
FABLES: Evaluating faithfulness and content selection in book-length summarization Paper • 2404.01261 • Published Apr 1 • 3
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 92
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models Paper • 2403.20331 • Published Mar 29 • 14
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model Paper • 2402.03766 • Published Feb 6 • 9
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 119
A Touch, Vision, and Language Dataset for Multimodal Alignment Paper • 2402.13232 • Published Feb 20 • 11
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20 • 45
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12 • 43
TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation Paper • 2401.14373 • Published Jan 25 • 11
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 50
PersianMind: A Cross-Lingual Persian-English Large Language Model Paper • 2401.06466 • Published Jan 12 • 2
A Simple Framework to Accelerate Multilingual Language Model for Monolingual Text Generation Paper • 2401.10660 • Published Jan 19 • 2
MaLA-500: Massive Language Adaptation of Large Language Models Paper • 2401.13303 • Published Jan 24 • 11
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 131
DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines Paper • 2312.13382 • Published Dec 20, 2023 • 2
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 25
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 72
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 59
DocGraphLM: Documental Graph Language Model for Information Extraction Paper • 2401.02823 • Published Jan 5 • 32