Datasets
Training Data: The model was trained using FineWeb-Edu for English and FineWeb2 for Korean.
Validation Data: wikitext (English) and wikipedia (Korean) were used for evaluation and validation.
Tokenizer