kd-shared/fineweb-CC-MAIN-2023-50-and-CC-MAIN-2024-10-meta-llama_Llama-2-7b-hf • Updated May 19, 2024 • 35
Evaluating the Ripple Effects of Knowledge Editing in Language Models • Paper • 2307.12976 • Published Jul 24, 2023 • 12
kd-shared/culturax-ar-spbpe32k-focus-embs-anneal-bf16-mixed-xassy-final • Text Generation • Updated Jun 25, 2024 • 6
FOCUS: Effective Embedding Initialization for Specializing Pretrained Multilingual Models on a Single Language • Paper • 2305.14481 • Published May 23, 2023 • 1
Efficient Parallelization Layouts for Large-Scale Distributed Model Training • Paper • 2311.05610 • Published Nov 9, 2023