bert-tiny-Massive-intent-KD-distilBERT
This model is a fine-tuned version of google/bert_uncased_L-2_H-128_A-2 on the MASSIVE dataset. It achieves the following results on the evaluation set (the values match the epoch 31 row of the training log):
- Loss: 1.6612
- Accuracy: 0.8396
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 33
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
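With `lr_scheduler_type: linear`, the learning rate decays from its initial value toward zero over the planned run. The following is a minimal sketch of that decay in plain Python, not the training code itself. It assumes no warmup steps (the card lists none) and a planned total of 50 epochs at 720 steps per epoch (the step count is read from the training log). `linear_lr` is a hypothetical helper name, not a library function.

```python
# Sketch of a linear learning-rate decay with no warmup.
# Assumptions (not stated in the card): warmup_steps = 0 and
# total_steps = num_epochs * steps_per_epoch.

BASE_LR = 5e-05          # learning_rate from the card
NUM_EPOCHS = 50          # num_epochs from the card
STEPS_PER_EPOCH = 720    # step count at epoch 1.0 in the training log
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Return the learning rate at `step` under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    for epoch in (0, 10, 34, 50):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2}: step {step:>5}, lr {linear_lr(step):.2e}")
```

Under these assumptions, the rate at step 24480 (the last logged row) would still be about 32% of its initial value.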
Training results
The log covers 34 of the 50 configured epochs; the card does not state why it ends there.
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
10.9795 | 1.0 | 720 | 9.3236 | 0.2917 |
9.4239 | 2.0 | 1440 | 7.9792 | 0.4092 |
8.2632 | 3.0 | 2160 | 6.9824 | 0.4811 |
7.3425 | 4.0 | 2880 | 6.1545 | 0.5514 |
6.56 | 5.0 | 3600 | 5.4829 | 0.6060 |
5.9032 | 6.0 | 4320 | 4.8994 | 0.6463 |
5.3078 | 7.0 | 5040 | 4.4129 | 0.6911 |
4.819 | 8.0 | 5760 | 4.0152 | 0.7073 |
4.3866 | 9.0 | 6480 | 3.6734 | 0.7324 |
3.9954 | 10.0 | 7200 | 3.3729 | 0.7516 |
3.6764 | 11.0 | 7920 | 3.1251 | 0.7600 |
3.3712 | 12.0 | 8640 | 2.9077 | 0.7752 |
3.1037 | 13.0 | 9360 | 2.7361 | 0.7787 |
2.8617 | 14.0 | 10080 | 2.5791 | 0.7860 |
2.6667 | 15.0 | 10800 | 2.4383 | 0.7944 |
2.476 | 16.0 | 11520 | 2.3301 | 0.7944 |
2.3203 | 17.0 | 12240 | 2.2099 | 0.8052 |
2.1698 | 18.0 | 12960 | 2.1351 | 0.8101 |
2.0563 | 19.0 | 13680 | 2.0554 | 0.8111 |
1.9294 | 20.0 | 14400 | 2.0100 | 0.8190 |
1.8304 | 21.0 | 15120 | 1.9566 | 0.8210 |
1.7315 | 22.0 | 15840 | 1.9076 | 0.8224 |
1.6587 | 23.0 | 16560 | 1.8511 | 0.8283 |
1.5876 | 24.0 | 17280 | 1.8230 | 0.8298 |
1.5173 | 25.0 | 18000 | 1.8002 | 0.8259 |
1.4676 | 26.0 | 18720 | 1.7667 | 0.8278 |
1.3956 | 27.0 | 19440 | 1.7512 | 0.8313 |
1.3436 | 28.0 | 20160 | 1.7233 | 0.8298 |
1.3031 | 29.0 | 20880 | 1.6802 | 0.8318 |
1.2584 | 30.0 | 21600 | 1.6768 | 0.8328 |
1.2233 | 31.0 | 22320 | 1.6612 | 0.8396 |
1.1884 | 32.0 | 23040 | 1.6608 | 0.8352 |
1.1374 | 33.0 | 23760 | 1.6195 | 0.8387 |
1.1299 | 34.0 | 24480 | 1.5969 | 0.8377 |
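The headline metrics at the top of this card (loss 1.6612, accuracy 0.8396) coincide with the epoch 31 row, which has the highest accuracy in the log, while the lowest validation loss appears at epoch 34. A short Python sketch of that lookup, using the last six rows copied from the table above, is shown here. Whether the published checkpoint was actually selected by accuracy is an assumption, since the card does not say.

```python
# (epoch, validation_loss, accuracy) for the last six rows of the table above.
ROWS = [
    (29, 1.6802, 0.8318),
    (30, 1.6768, 0.8328),
    (31, 1.6612, 0.8396),
    (32, 1.6608, 0.8352),
    (33, 1.6195, 0.8387),
    (34, 1.5969, 0.8377),
]

# Pick the row with the highest accuracy and the row with the lowest loss.
best_by_accuracy = max(ROWS, key=lambda row: row[2])
best_by_loss = min(ROWS, key=lambda row: row[1])

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

The two criteria pick different epochs here, so the choice of selection metric matters when comparing checkpoints from this run.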
Framework versions
- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1