---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - massive
metrics:
  - accuracy
model-index:
  - name: bert-tiny-Massive-intent-KD-BERT
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: massive
          type: massive
          config: en-US
          split: train
          args: en-US
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.853418593212002
---

# bert-tiny-Massive-intent-KD-BERT

This model is a fine-tuned version of [google/bert_uncased_L-2_H-128_A-2](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2) on the massive dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

- Loss: 0.8380
- Accuracy: 0.8534
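
For a quick check, here is a minimal inference sketch. The Hub id `gokuls/bert-tiny-Massive-intent-KD-BERT` is inferred from the card title and author rather than stated in the card, so verify it before use:

```python
from transformers import pipeline

# Hub id assumed from the card title and author -- verify it on the Hub.
classifier = pipeline(
    "text-classification",
    model="gokuls/bert-tiny-Massive-intent-KD-BERT",
)

# An utterance in the style of the MASSIVE intent data.
print(classifier("wake me up at nine am on friday"))
# e.g. [{'label': 'alarm_set', 'score': ...}] -- exact label names come from the model config
```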

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
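
The card leaves this section blank. For orientation, the metadata above names the `massive` dataset with config `en-US`; a hedged loading sketch, assuming the canonical Hub id `AmazonScience/massive`:

```python
from datasets import load_dataset

# "AmazonScience/massive" is an assumption; the card only says "massive".
massive = load_dataset("AmazonScience/massive", "en-US")

print(massive)                        # DatasetDict with train/validation/test splits
print(massive["train"][0]["utt"])     # raw utterance text
print(massive["train"][0]["intent"])  # integer intent label (MASSIVE has 60 intents)
```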

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 33
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
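
Expressed as a `TrainingArguments` sketch; `output_dir` and `evaluation_strategy` are illustrative assumptions, not taken from the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-tiny-Massive-intent-KD-BERT",  # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=33,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,  # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumption: the results table reports per-epoch eval
)
```

The listed Adam betas=(0.9,0.999) and epsilon=1e-08 are the `TrainingArguments` defaults, so they need no explicit arguments.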

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 5.83          | 1.0   | 720   | 4.8826          | 0.3050   |
| 4.7602        | 2.0   | 1440  | 3.9904          | 0.4191   |
| 4.0301        | 3.0   | 2160  | 3.3806          | 0.5032   |
| 3.4797        | 4.0   | 2880  | 2.9065          | 0.5967   |
| 3.0352        | 5.0   | 3600  | 2.5389          | 0.6596   |
| 2.6787        | 6.0   | 4320  | 2.2342          | 0.7044   |
| 2.3644        | 7.0   | 5040  | 1.9873          | 0.7354   |
| 2.1145        | 8.0   | 5760  | 1.7928          | 0.7462   |
| 1.896         | 9.0   | 6480  | 1.6293          | 0.7644   |
| 1.7138        | 10.0  | 7200  | 1.5062          | 0.7752   |
| 1.5625        | 11.0  | 7920  | 1.3923          | 0.7885   |
| 1.4229        | 12.0  | 8640  | 1.3092          | 0.7978   |
| 1.308         | 13.0  | 9360  | 1.2364          | 0.8018   |
| 1.201         | 14.0  | 10080 | 1.1759          | 0.8155   |
| 1.1187        | 15.0  | 10800 | 1.1322          | 0.8214   |
| 1.0384        | 16.0  | 11520 | 1.0990          | 0.8234   |
| 0.976         | 17.0  | 12240 | 1.0615          | 0.8308   |
| 0.9163        | 18.0  | 12960 | 1.0377          | 0.8328   |
| 0.8611        | 19.0  | 13680 | 1.0054          | 0.8337   |
| 0.812         | 20.0  | 14400 | 0.9926          | 0.8367   |
| 0.7721        | 21.0  | 15120 | 0.9712          | 0.8382   |
| 0.7393        | 22.0  | 15840 | 0.9586          | 0.8357   |
| 0.7059        | 23.0  | 16560 | 0.9428          | 0.8372   |
| 0.6741        | 24.0  | 17280 | 0.9377          | 0.8396   |
| 0.6552        | 25.0  | 18000 | 0.9229          | 0.8377   |
| 0.627         | 26.0  | 18720 | 0.9100          | 0.8416   |
| 0.5972        | 27.0  | 19440 | 0.9028          | 0.8416   |
| 0.5784        | 28.0  | 20160 | 0.8996          | 0.8406   |
| 0.5595        | 29.0  | 20880 | 0.8833          | 0.8451   |
| 0.5438        | 30.0  | 21600 | 0.8772          | 0.8475   |
| 0.5218        | 31.0  | 22320 | 0.8758          | 0.8451   |
| 0.509         | 32.0  | 23040 | 0.8728          | 0.8480   |
| 0.4893        | 33.0  | 23760 | 0.8640          | 0.8480   |
| 0.4948        | 34.0  | 24480 | 0.8541          | 0.8475   |
| 0.4722        | 35.0  | 25200 | 0.8595          | 0.8495   |
| 0.468         | 36.0  | 25920 | 0.8488          | 0.8495   |
| 0.4517        | 37.0  | 26640 | 0.8460          | 0.8505   |
| 0.4462        | 38.0  | 27360 | 0.8450          | 0.8485   |
| 0.4396        | 39.0  | 28080 | 0.8422          | 0.8490   |
| 0.427         | 40.0  | 28800 | 0.8380          | 0.8534   |
| 0.4287        | 41.0  | 29520 | 0.8385          | 0.8480   |
| 0.4222        | 42.0  | 30240 | 0.8319          | 0.8510   |
| 0.421         | 43.0  | 30960 | 0.8296          | 0.8510   |
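
To sanity-check the reported numbers, a hedged re-evaluation sketch on the MASSIVE validation split; the Hub ids are assumptions inferred from the card:

```python
import evaluate
import numpy as np
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer

model_id = "gokuls/bert-tiny-Massive-intent-KD-BERT"  # assumption, see above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Tokenize the validation split and expose the intent ids as labels.
val = load_dataset("AmazonScience/massive", "en-US", split="validation")
val = val.map(lambda batch: tokenizer(batch["utt"], truncation=True), batched=True)
val = val.rename_column("intent", "labels")

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return accuracy.compute(predictions=np.argmax(logits, -1), references=labels)

# A default Trainer pads batches via the tokenizer and drops unused columns.
trainer = Trainer(model=model, tokenizer=tokenizer, compute_metrics=compute_metrics)
print(trainer.evaluate(eval_dataset=val))  # accuracy should land near the reported 0.8534
```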

### Framework versions

- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1