The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources Paper • 2406.16746 • Published Jun 24, 2024
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning Paper • 2301.13688 • Published Jan 31, 2023 • 8
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering Paper • 2007.15207 • Published Jul 30, 2020
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 119
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Paper • 2303.03915 • Published Mar 7, 2023 • 6
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 41
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12, 2024 • 45
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023 • 53
OctoPack: Instruction Tuning Code Large Language Models Paper • 2308.07124 • Published Aug 14, 2023 • 28
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code Paper • 2206.11249 • Published Jun 22, 2022
TESS: Text-to-Text Self-Conditioned Simplex Diffusion Paper • 2305.08379 • Published May 15, 2023 • 1
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts Paper • 2202.01279 • Published Feb 2, 2022
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP Paper • 2112.10508 • Published Dec 20, 2021
How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts Paper • 2205.10762 • Published May 22, 2022