The largest public domain dataset for training LLMs.
PleIAs
company
AI & ML interests
Open Science LLMs
Organization Card
PleIAs is a French startup training LLMs with an open science approach.
Collections: 1
Models: none public yet
Datasets: 30
PleIAs/post-ocr
PleIAs/Openalex-free-PDF
PleIAs/openalex_extraction
PleIAs/Common-Corpus-Mini
PleIAs/openalex-free-license
PleIAs/Post-OCR-Correction
PleIAs/YouTube-Commons
PleIAs/US-PD-Newspapers
PleIAs/German-PD
PleIAs/Greek-PD