---
base_model:
- microsoft/codebert-base
datasets:
- devngho/the-stack-llm-annotations-v2
language:
- code
library_name: transformers
license: mit
metrics:
- f1
---
# devngho/code_edu_classifier-v3-microsoft_codebert-base
This model is microsoft/codebert-base with a classification head. It is designed to evaluate the educational value of code, similar to HuggingFaceFW/fineweb-edu-classifier but focused on code. The training data comes from the devngho/the-stack-llm-annotations-v2 dataset, which contains samples extracted from bigcode/the-stack-dedup and annotated with Qwen/Qwen2.5-Coder-32B-Instruct.
This research was supported with Cloud TPUs from Google's TPU Research Cloud (TRC). ⚡

## Details
- Developed by: devngho
- Language(s): code
- License: mit
- Base model: microsoft/codebert-base
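A minimal usage sketch, assuming the standard `transformers` sequence-classification API with a 6-way head (the card does not show inference code, and the helper names `load_classifier`, `score_code`, and `is_educational` are hypothetical):

```python
def load_classifier(model_id: str = "devngho/code_edu_classifier-v3-microsoft_codebert-base"):
    """Load the tokenizer and model (downloads weights on first call)."""
    # Imports are deferred so the lightweight helpers below can be used
    # without torch/transformers installed.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    return tokenizer, model

def score_code(code: str, tokenizer, model) -> int:
    """Predict an educational score (0-5) for a code snippet."""
    import torch
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())

def is_educational(score: int, threshold: int = 3) -> bool:
    """Binary decision matching the >= 3 split used in the Performance section."""
    return score >= threshold
```

For example, `tokenizer, model = load_classifier()` followed by `is_educational(score_code(src, tokenizer, model))` gives a binary educational/non-educational judgment.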
## Training details
- learning_rate: 3e-4 (cosine)
- warmup_ratio: 0.1
- batch_size: 2048 (512 * 4)
- optimizer: adamw(b1=0.9, b2=0.98, eps=1e-8, weight_decay=0.01)
- duration: 4h 41m
- steps: 6080
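The schedule above (cosine with 10% warmup, peaking at 3e-4 over 6080 steps) can be sketched as a pure function. The linear warmup shape and the decay-to-zero floor are assumptions, not stated in the card:

```python
import math

def lr_at(step: int, total_steps: int = 6080, warmup_ratio: float = 0.1,
          peak_lr: float = 3e-4) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)  # 608 steps with these defaults
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Here `lr_at(608)` is the peak 3e-4 and `lr_at(6080)` has decayed to 0.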
## Training hardware
TPU v4-8
## Performance
Validation Report:

```
              precision    recall  f1-score   support

           0       0.80      0.06      0.10        72
           1       0.62      0.40      0.48       835
           2       0.61      0.62      0.61      2722
           3       0.48      0.72      0.58      1891
           4       0.62      0.02      0.05       623
           5       0.00      0.00      0.00         1

    accuracy                           0.55      6144
   macro avg       0.52      0.30      0.30      6144
weighted avg       0.58      0.55      0.52      6144
```
Confusion Matrix:

```
[[   4   36   30    2    0    0]
 [   1  330  464   40    0    0]
 [   0  157 1684  881    0    0]
 [   0    5  516 1361    9    0]
 [   0    0   71  537   15    0]
 [   0    0    0    1    0    0]]
```
When the scores are binarized at 3 (labels >= 3 vs. < 3), the F1 score is about 0.72.
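That binarized figure can be checked directly from the confusion matrix above, assuming sklearn's convention of rows as true labels and columns as predicted labels:

```python
# The reported confusion matrix (rows = true label, columns = predicted label).
cm = [
    [4, 36, 30, 2, 0, 0],
    [1, 330, 464, 40, 0, 0],
    [0, 157, 1684, 881, 0, 0],
    [0, 5, 516, 1361, 9, 0],
    [0, 0, 71, 537, 15, 0],
    [0, 0, 0, 1, 0, 0],
]

T = 3  # binarization threshold: scores >= 3 count as "educational"
tp = sum(cm[i][j] for i in range(T, 6) for j in range(T, 6))
fp = sum(cm[i][j] for i in range(0, T) for j in range(T, 6))
fn = sum(cm[i][j] for i in range(T, 6) for j in range(0, T))
f1 = 2 * tp / (2 * tp + fp + fn)
print(round(f1, 2))  # → 0.72
```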