**Train-Test Set:** "teknofest_train_final.csv"
**Model:** "dbmdz/bert-base-turkish-128k-uncased"
**Preprocessing**
- Characters are lowercased
- Punctuation marks are removed
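
The two preprocessing steps above could be sketched as follows. This is a minimal illustration, not the original pipeline: the function name `preprocess` is hypothetical, the Turkish-aware handling of `I`/`İ` and the whitespace collapsing are assumptions, and only ASCII punctuation (`string.punctuation`) is stripped.

```python
import string

# Assumption: map dotted/dotless capital I the Turkish way before lowercasing,
# since Python's default lower() turns "I" into "i" and "İ" into "i" + combining dot.
_TURKISH_UPPER = str.maketrans({"I": "ı", "İ": "i"})

# Deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> str:
    """Lowercase the text and remove punctuation marks."""
    text = text.translate(_TURKISH_UPPER).lower()
    text = text.translate(_STRIP_PUNCT)
    # Assumption: collapse whitespace left behind by removed punctuation.
    return " ".join(text.split())


print(preprocess("Merhaba, DÜNYA!"))
```

Whether the original preprocessing handled `I`/`İ` this way is not stated; dropping the first `translate` call gives plain `str.lower()` behaviour.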
## Tokenizer Parameters
```
max_length=64
padding=True
truncation=True
```
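
To make the effect of these three settings concrete, here is a library-free sketch of what they do to a batch of token-ID lists: sequences longer than `max_length` are cut, and shorter ones are padded to the longest sequence in the batch (which is what `padding=True` means in Hugging Face tokenizers). The helper `pad_and_truncate` and the pad ID `0` are assumptions for illustration; a real tokenizer also counts special tokens such as `[CLS]` and `[SEP]` toward `max_length`, which this sketch ignores.

```python
from typing import List, Tuple

MAX_LENGTH = 64  # from the parameters above
PAD_ID = 0       # assumption: the real value comes from the tokenizer


def pad_and_truncate(
    batch: List[List[int]],
    max_length: int = MAX_LENGTH,
    pad_id: int = PAD_ID,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Truncate each sequence to max_length, then pad to the batch's longest."""
    truncated = [ids[:max_length] for ids in batch]
    longest = max((len(ids) for ids in truncated), default=0)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    # 1 marks real tokens, 0 marks padding.
    attention_mask = [
        [1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated
    ]
    return input_ids, attention_mask


ids, mask = pad_and_truncate([[5, 6, 7], list(range(1, 101))])
print(len(ids[0]), len(ids[1]), sum(mask[0]))
```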
## Training Parameters
- **Epoch:** 3
- **Learning Rate:** 7e-5
- **Batch-Size:** 64
- **Tokenizer Length:** 64
- **Loss:** BCE
- **Online Hard Example Mining:** On
- **Class-Weighting:** On (^0.3)
- **Early Stopping:** Off
- **Stratified Batch Sampling:** On
- **Gradient Accumulation:** Off
- **LR Scheduler:** Cosine-with-Warmup
- **Warmup Ratio:** 0.1
- **Weight Decay:** 0.01
- **LLRD:** 0.95
- **Label Smoothing:** 0.05
- **Gradient Clipping:** 1.0
- **MLM Pre-Training:** Off
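
A few of the settings above are easier to read as arithmetic. The sketch below shows one plausible interpretation of the learning-rate schedule, the layer-wise learning-rate decay (LLRD), and the class weighting. All function names are hypothetical, and several details are assumptions rather than facts from this card: that `^0.3` means inverse-frequency weights raised to the power 0.3, that the top encoder layer keeps the base learning rate, that the encoder has 12 layers, and the illustrative `total_steps` value.

```python
import math

BASE_LR = 7e-5
WARMUP_RATIO = 0.1
LLRD_DECAY = 0.95
CLASS_WEIGHT_POWER = 0.3


def cosine_with_warmup(step: int, total_steps: int,
                       warmup_ratio: float = WARMUP_RATIO) -> float:
    """LR multiplier in [0, 1]: linear warmup, then half-cosine decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


def layerwise_lrs(num_layers: int = 12, base_lr: float = BASE_LR,
                  decay: float = LLRD_DECAY) -> list:
    """Assumed convention: top layer gets base_lr, each layer below is scaled by decay."""
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


def class_weights(counts: dict, power: float = CLASS_WEIGHT_POWER) -> dict:
    """Assumed meaning of ^0.3: inverse-frequency weights softened by an exponent."""
    total = sum(counts.values())
    k = len(counts)
    return {c: (total / (k * n)) ** power for c, n in counts.items()}


# Class counts taken from the support column of the CV10 report.
counts = {"INSULT": 2393, "OTHER": 3528, "PROFANITY": 2376,
          "RACIST": 2033, "SEXIST": 2081}
total_steps = 582  # illustrative value only

for step in (0, 29, 58, 320, 582):
    print(step, BASE_LR * cosine_with_warmup(step, total_steps))
print(layerwise_lrs())
print(class_weights(counts))
```

With an exponent below 1 the weights stay close to 1, so rare classes are boosted only mildly compared with plain inverse-frequency weighting.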
## CV10 Results
```
              precision    recall  f1-score   support

      INSULT     0.9098    0.9143    0.9120      2393
       OTHER     0.9596    0.9481    0.9538      3528
   PROFANITY     0.9599    0.9575    0.9587      2376
      RACIST     0.9551    0.9636    0.9594      2033
      SEXIST     0.9552    0.9635    0.9593      2081

    accuracy                         0.9485     12411
   macro avg     0.9479    0.9494    0.9486     12411
weighted avg     0.9486    0.9485    0.9485     12411
```
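
As a sanity check on how the summary rows relate to the per-class rows, the macro average is the unweighted mean over the five classes and the weighted average uses the support counts as weights. The short sketch below recomputes both from the per-class numbers in the report; the variable names are illustrative, and small differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
# Per-class (precision, recall, f1, support) copied from the report above.
report = {
    "INSULT":    (0.9098, 0.9143, 0.9120, 2393),
    "OTHER":     (0.9596, 0.9481, 0.9538, 3528),
    "PROFANITY": (0.9599, 0.9575, 0.9587, 2376),
    "RACIST":    (0.9551, 0.9636, 0.9594, 2033),
    "SEXIST":    (0.9552, 0.9635, 0.9593, 2081),
}

rows = list(report.values())
total_support = sum(row[3] for row in rows)

# Macro average: plain mean over classes, ignoring class size.
macro = tuple(sum(row[i] for row in rows) / len(rows) for i in range(3))

# Weighted average: each class contributes in proportion to its support.
weighted = tuple(
    sum(row[i] * row[3] for row in rows) / total_support for i in range(3)
)

print("support     ", total_support)
print("macro avg   ", [round(v, 4) for v in macro])
print("weighted avg", [round(v, 4) for v in weighted])
```

The support-weighted recall is also the overall accuracy, which is why those two figures agree in the report.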