Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 132
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation Paper • 2412.03304 • Published Dec 4, 2024 • 18
Open Language Data Initiative: Advancing Low-Resource Machine Translation for Karakalpak Paper • 2409.04269 • Published Sep 6, 2024 • 10
dilmash release Collection Dilmash: Karakalpak Machine Translation • 5 items • Updated Sep 10, 2024 • 2
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset Paper • 2309.04662 • Published Sep 9, 2023 • 23