- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 98
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 39
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 100
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 61
Collections including paper arxiv:2404.00399
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 39
- aurora-m/aurora-m-base
  Text Generation • Updated • 5 • 16
- aurora-m/aurora-m-biden-harris-redteamed
  Text Generation • Updated • 9 • 17
- aurora-m/aurora-m-instruct
  Text Generation • Updated • 4 • 11
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
  Paper • 2403.05530 • Published • 49
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 39
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 79
- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 5
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 13
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 10
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 62
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 50
- GlórIA -- A Generative and Open Large Language Model for Portuguese
  Paper • 2402.12969 • Published
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 39
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 69
- An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
  Paper • 2309.09958 • Published • 18
- Noise-Aware Training of Layout-Aware Language Models
  Paper • 2404.00488 • Published • 6
- Streaming Dense Video Captioning
  Paper • 2404.01297 • Published • 9