Train-Test Set: "intent-multilabel-v1-2.zip"

Model: "dbmdz/bert-base-turkish-cased"

Tokenizer Params

max_length=128
padding="max_length"
truncation=True
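
A minimal sketch of how these parameters might be passed to the tokenizer, assuming the Hugging Face `transformers` library. The helper names `load_tokenizer` and `encode` are hypothetical and do not come from the original training script.

```python
# Tokenizer settings copied from the list above.
TOKENIZER_KWARGS = {
    "max_length": 128,
    "padding": "max_length",
    "truncation": True,
}


def load_tokenizer(model_name="dbmdz/bert-base-turkish-cased"):
    """Load the tokenizer matching the base model (downloads on first use)."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_name)


def encode(texts, tokenizer):
    """Tokenize a list of strings, padding or truncating each one to 128 tokens."""
    return tokenizer(texts, **TOKENIZER_KWARGS)
```

With `padding="max_length"` and `truncation=True`, every encoded example has exactly `max_length` token ids, so batches have a fixed shape.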

Training Params

evaluation_strategy = "epoch"
save_strategy = "epoch"
per_device_train_batch_size = 16
per_device_eval_batch_size = 16
num_train_epochs = 4
load_best_model_at_end = True
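
A hedged sketch of how these values could be handed to `transformers.TrainingArguments`. The `output_dir` value and the helper name `build_training_args` are placeholders, not taken from the original run.

```python
# Values copied from the list above.
TRAINING_KWARGS = {
    "evaluation_strategy": "epoch",
    "save_strategy": "epoch",
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 4,
    "load_best_model_at_end": True,
}


def build_training_args(output_dir="outputs"):
    """Build TrainingArguments; output_dir is a placeholder, not from the source."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

`load_best_model_at_end = True` needs the save and evaluation strategies to match, which they do here (both `"epoch"`). Recent `transformers` releases name the first argument `eval_strategy` instead of `evaluation_strategy`, so the key may need renaming depending on the installed version.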

Train-Val Splitting Configuration

train_test_split(df_train,
                 test_size=0.1,
                 random_state=1111)

Class Loss Weights

  • Alakasiz: 1.0
  • Barinma: 1.5167249178108022
  • Elektronik: 1.7547338578655642
  • Giysi: 1.9610520059358458
  • Kurtarma: 1.269341370129623
  • Lojistik: 1.8684086209021484
  • Saglik: 1.8019018017117145
  • Su: 2.110648663094536
  • Yagma: 3.081208739200435
  • Yemek: 1.7994815143101963
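
The card does not say how these weights enter the loss. One plausible reading, shown here purely as an assumption, is that each class's binary cross-entropy term is multiplied by its weight before averaging. The function name `weighted_bce` is hypothetical, and the sketch uses plain Python so the arithmetic is easy to follow.

```python
import math

# Weights copied from the list above, in the same class order.
CLASS_WEIGHTS = {
    "Alakasiz": 1.0,
    "Barinma": 1.5167249178108022,
    "Elektronik": 1.7547338578655642,
    "Giysi": 1.9610520059358458,
    "Kurtarma": 1.269341370129623,
    "Lojistik": 1.8684086209021484,
    "Saglik": 1.8019018017117145,
    "Su": 2.110648663094536,
    "Yagma": 3.081208739200435,
    "Yemek": 1.7994815143101963,
}


def weighted_bce(logits, targets, weights):
    """Class-weighted binary cross-entropy for one example (assumed scheme).

    logits, targets and weights are equal-length sequences, one entry per class.
    """
    total = 0.0
    for z, y, w in zip(logits, targets, weights):
        # Numerically stable BCE on a logit: max(z, 0) - z*y + log(1 + exp(-|z|)).
        bce = max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        total += w * bce
    return total / len(weights)
```

In PyTorch, a comparable effect can be obtained with the `weight` argument of `torch.nn.BCEWithLogitsLoss`; whether the original training used that or a custom loss is not stated.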

Training Log (Class-Scaled)

Epoch    Training Loss    Validation Loss
1        No log           0.216295
2        0.260000         0.171498
3        0.142700         0.175608
4        0.142700         0.169851

Threshold Optimization

  • Best Threshold: 0.15
  • F1 @ Threshold: 0.7503
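
The search procedure itself is not described in the card. The sketch below shows one common approach, assuming a single global threshold applied to the sigmoid outputs and macro-averaged F1 as the objective (the reported 0.7503 is consistent with the macro F1 of 0.75 in the evaluation table, but that link is an inference). All function names and the toy data are illustrative.

```python
def f1_for_class(y_true, y_pred, k):
    """F1 for class index k over parallel lists of 0/1 label rows."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    n_classes = len(y_true[0])
    return sum(f1_for_class(y_true, y_pred, k) for k in range(n_classes)) / n_classes


def best_threshold(y_true, probs, candidates):
    """Return (threshold, macro F1) for the best-scoring candidate threshold."""
    best_t, best_score = None, -1.0
    for t in candidates:
        # A label is predicted whenever its probability reaches the threshold.
        preds = [[int(p >= t) for p in row] for row in probs]
        score = macro_f1(y_true, preds)
        if score > best_score:  # strict '>' keeps the lowest threshold on ties
            best_t, best_score = t, score
    return best_t, best_score


# Toy two-class data, purely illustrative (not from the model's validation set).
toy_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_probs = [[0.9, 0.2], [0.1, 0.4], [0.3, 0.8], [0.05, 0.1]]
toy_candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

threshold, score = best_threshold(toy_true, toy_probs, toy_candidates)
```

A low optimum such as 0.15 is typical for multilabel models whose minority classes receive small probabilities, since lowering the cut-off trades some precision for recall.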

Eval Results

              precision    recall  f1-score   support

    Alakasiz       0.91      0.87      0.89       734
     Barinma       0.85      0.81      0.83       207
  Elektronik       0.72      0.78      0.75       130
       Giysi       0.73      0.67      0.70        94
    Kurtarma       0.86      0.81      0.83       362
    Lojistik       0.68      0.56      0.62       112
      Saglik       0.72      0.81      0.76       108
          Su       0.61      0.69      0.65        78
       Yagma       0.67      0.65      0.66        31
       Yemek       0.79      0.85      0.82       117

   micro avg       0.82      0.81      0.81      1973
   macro avg       0.75      0.75      0.75      1973
weighted avg       0.83      0.81      0.81      1973
 samples avg       0.84      0.84      0.83      1973
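
As a consistency check, the macro and weighted F1 averages can be recomputed from the per-class rows above. Because the inputs are already rounded to two decimals, the results only agree with the table to that precision.

```python
# (f1-score, support) per class, copied from the table above.
REPORT = {
    "Alakasiz": (0.89, 734),
    "Barinma": (0.83, 207),
    "Elektronik": (0.75, 130),
    "Giysi": (0.70, 94),
    "Kurtarma": (0.83, 362),
    "Lojistik": (0.62, 112),
    "Saglik": (0.76, 108),
    "Su": (0.65, 78),
    "Yagma": (0.66, 31),
    "Yemek": (0.82, 117),
}

# Macro average: every class counts equally, regardless of how common it is.
macro_f1 = sum(f1 for f1, _ in REPORT.values()) / len(REPORT)

# Weighted average: each class contributes in proportion to its support.
total_support = sum(n for _, n in REPORT.values())
weighted_f1 = sum(f1 * n for f1, n in REPORT.values()) / total_support

print(f"macro F1 = {macro_f1:.2f}, weighted F1 = {weighted_f1:.2f}, support = {total_support}")
```

The gap between the macro (0.75) and weighted (0.81) averages reflects that the largest class, Alakasiz, also has the highest F1, while smaller classes such as Lojistik, Su and Yagma score lower.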