Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models • Article • Published Mar 20
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale • Paper • arXiv:2406.17557
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations • Paper • arXiv:2405.18392 • Published May 28
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset • Paper • arXiv:2303.03915 • Published Mar 7, 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model • Paper • arXiv:2211.05100 • Published Nov 9, 2022