For fineweb-edu in Korean
devngho PRO
AI & ML interests
Efficient Korean NLP, Fine Korean datasets
Recent Activity
New activity, 2 days ago: devngho/instruction-mix: [bot] Conversion to Parquet
Updated a dataset, 3 days ago: devngho/instruction-mix
Published a dataset, 3 days ago: devngho/instruction-mix
Collections: 3
Spaces: 1
Models: 35
devngho/gaenari-phi-4-pt-preview • Text Generation • 61
devngho/phi4-mini-jamo-tokenizer
devngho/phi4-jamo-tokenizer • 1
devngho/phi-4-jamo-init • Text Generation • 52
devngho/llama-ablation-large-korean-corpus-jamo • Text Generation
devngho/llama-ablation-large-korean-corpus_edu-jamo-sft • Text Generation
devngho/jamo-tokenizer-exp1-chatml
devngho/llama-ablation-large-korean-corpus_edu-sft • Text Generation
devngho/llama-ablation-large-korean-corpus_edu-jamo • Text Generation
devngho/non-jamo-tokenizer-exp1-chatml
Datasets: 19
devngho/instruction-mix • 1.44M • 8
devngho/gaenari-assets • 1 • 29
devngho/the-stack-mini-edu • 832k • 202
devngho/korean-instruction-mix • 516k • 17 • 1
devngho/the-stack-llm-annotations-v2 • 1.89M • 89
devngho/korean-webtext-edu • 1.98M • 13 • 1
devngho/korean-textbooks-edu • 10.1M • 14 • 1
devngho/korean-wikipedia-edu • 605k • 28 • 1
devngho/the_stack_llm_annotations • 1.89M • 8 • 3
devngho/ko_llm_annotations • 1.55M • 13 • 2