A multilingual news corpus built from Common Crawl CC-News, indexed and queryable in milliseconds, cleaned and enriched with language and topic id
Ruggero Marino Lazzaroni
ruggsea
AI & ML interests
NLP in any form
Recent Activity
updated a dataset about 2 hours ago: ruggsea/social-sim-bench-gens
published a dataset about 2 hours ago: ruggsea/social-sim-bench-gens
updated a collection 7 days ago: Infini-News