CroissantLLM: A Truly Bilingual French-English Language Model • Paper • 2402.00786 • Published Feb 1
Common Corpus • Collection • The largest public domain dataset for training LLMs. • 27 items