# bert-base-multilingual-cased-finetuned-nli
This model is a fine-tuned version of bert-base-multilingual-cased on the xnli dataset. It achieves the following results on the evaluation set:
- Loss: 0.4681
- Accuracy: 0.8157
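For NLI inference, the premise and hypothesis are passed to the tokenizer as a sentence pair. Below is a minimal usage sketch; the Hub repo id is assumed from the card title, and the label mapping assumes the standard XNLI order (0 = entailment, 1 = neutral, 2 = contradiction):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Hypothetical repo id taken from the card title; substitute the actual Hub path.
model_id = "bert-base-multilingual-cased-finetuned-nli"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "Der Hund rennt über die Wiese."      # "The dog runs across the meadow."
hypothesis = "Ein Tier bewegt sich im Freien."  # "An animal is moving outdoors."

# Encode premise and hypothesis as a single sentence pair.
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # assumes XNLI label order in the config
```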
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the reconstruction sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
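The card lists only the hyperparameters; the training script itself is not included. As a rough reconstruction, a `Trainer` setup like the one below would match them. The `xnli` language config and the preprocessing are assumptions, not confirmed by the card, though the step counts in the results table are consistent with the ~393k-example English train split at batch size 32:

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumption: the English XNLI config; the card does not name the language(s).
dataset = load_dataset("xnli", "en")

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

def tokenize(batch):
    # XNLI examples carry "premise" and "hypothesis" text columns.
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=3
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="bert-base-multilingual-cased-finetuned-nli",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    num_train_epochs=2,
    lr_scheduler_type="linear",   # Adam betas/epsilon above are the defaults
    evaluation_strategy="steps",
    eval_steps=200,               # matches the 200-step cadence in the table below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```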
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
0.9299 | 0.02 | 200 | 0.8468 | 0.6277 |
0.7967 | 0.03 | 400 | 0.7425 | 0.6855 |
0.7497 | 0.05 | 600 | 0.7116 | 0.6924 |
0.7083 | 0.07 | 800 | 0.6868 | 0.7153 |
0.6882 | 0.08 | 1000 | 0.6638 | 0.7289 |
0.6944 | 0.1 | 1200 | 0.6476 | 0.7361 |
0.6682 | 0.11 | 1400 | 0.6364 | 0.7458 |
0.6635 | 0.13 | 1600 | 0.6592 | 0.7337 |
0.6423 | 0.15 | 1800 | 0.6120 | 0.7510 |
0.6196 | 0.16 | 2000 | 0.5990 | 0.7582 |
0.6381 | 0.18 | 2200 | 0.6026 | 0.7538 |
0.6276 | 0.2 | 2400 | 0.6054 | 0.7598 |
0.6248 | 0.21 | 2600 | 0.6368 | 0.7526 |
0.6331 | 0.23 | 2800 | 0.5959 | 0.7655 |
0.6142 | 0.24 | 3000 | 0.6117 | 0.7554 |
0.6124 | 0.26 | 3200 | 0.6221 | 0.7570 |
0.6127 | 0.28 | 3400 | 0.5748 | 0.7695 |
0.602 | 0.29 | 3600 | 0.5735 | 0.7598 |
0.5923 | 0.31 | 3800 | 0.5609 | 0.7723 |
0.5827 | 0.33 | 4000 | 0.5635 | 0.7743 |
0.5732 | 0.34 | 4200 | 0.5547 | 0.7771 |
0.5757 | 0.36 | 4400 | 0.5629 | 0.7739 |
0.5736 | 0.37 | 4600 | 0.5680 | 0.7659 |
0.5642 | 0.39 | 4800 | 0.5437 | 0.7871 |
0.5763 | 0.41 | 5000 | 0.5589 | 0.7807 |
0.5713 | 0.42 | 5200 | 0.5355 | 0.7867 |
0.5644 | 0.44 | 5400 | 0.5346 | 0.7888 |
0.5727 | 0.46 | 5600 | 0.5519 | 0.7815 |
0.5539 | 0.47 | 5800 | 0.5219 | 0.7900 |
0.5516 | 0.49 | 6000 | 0.5560 | 0.7795 |
0.5539 | 0.51 | 6200 | 0.5544 | 0.7847 |
0.5693 | 0.52 | 6400 | 0.5322 | 0.7932 |
0.5632 | 0.54 | 6600 | 0.5404 | 0.7936 |
0.565 | 0.55 | 6800 | 0.5382 | 0.7880 |
0.5555 | 0.57 | 7000 | 0.5364 | 0.7920 |
0.5329 | 0.59 | 7200 | 0.5177 | 0.7964 |
0.54 | 0.6 | 7400 | 0.5286 | 0.7916 |
0.554 | 0.62 | 7600 | 0.5401 | 0.7835 |
0.5447 | 0.64 | 7800 | 0.5261 | 0.7876 |
0.5438 | 0.65 | 8000 | 0.5032 | 0.8020 |
0.5505 | 0.67 | 8200 | 0.5220 | 0.7924 |
0.5364 | 0.68 | 8400 | 0.5398 | 0.7876 |
0.5317 | 0.7 | 8600 | 0.5310 | 0.7944 |
0.5361 | 0.72 | 8800 | 0.5297 | 0.7936 |
0.5204 | 0.73 | 9000 | 0.5270 | 0.7940 |
0.5189 | 0.75 | 9200 | 0.5193 | 0.7964 |
0.5348 | 0.77 | 9400 | 0.5270 | 0.7867 |
0.5363 | 0.78 | 9600 | 0.5194 | 0.7924 |
0.5184 | 0.8 | 9800 | 0.5298 | 0.7888 |
0.5072 | 0.81 | 10000 | 0.4999 | 0.7992 |
0.5229 | 0.83 | 10200 | 0.4922 | 0.8108 |
0.5201 | 0.85 | 10400 | 0.5019 | 0.7920 |
0.5304 | 0.86 | 10600 | 0.4959 | 0.7992 |
0.5061 | 0.88 | 10800 | 0.5047 | 0.7980 |
0.5291 | 0.9 | 11000 | 0.4974 | 0.8068 |
0.5099 | 0.91 | 11200 | 0.4988 | 0.8036 |
0.5271 | 0.93 | 11400 | 0.4899 | 0.8028 |
0.5211 | 0.95 | 11600 | 0.4866 | 0.8092 |
0.4977 | 0.96 | 11800 | 0.5059 | 0.7960 |
0.5155 | 0.98 | 12000 | 0.4821 | 0.8084 |
0.5061 | 0.99 | 12200 | 0.4763 | 0.8116 |
0.4607 | 1.01 | 12400 | 0.5245 | 0.8020 |
0.4435 | 1.03 | 12600 | 0.5021 | 0.8032 |
0.4289 | 1.04 | 12800 | 0.5219 | 0.8060 |
0.4227 | 1.06 | 13000 | 0.5119 | 0.8076 |
0.4349 | 1.08 | 13200 | 0.4957 | 0.8104 |
0.4331 | 1.09 | 13400 | 0.4914 | 0.8129 |
0.4269 | 1.11 | 13600 | 0.4785 | 0.8145 |
0.4185 | 1.12 | 13800 | 0.4879 | 0.8161 |
0.4244 | 1.14 | 14000 | 0.4834 | 0.8149 |
0.4016 | 1.16 | 14200 | 0.5084 | 0.8056 |
0.4106 | 1.17 | 14400 | 0.4993 | 0.8052 |
0.4345 | 1.19 | 14600 | 0.5029 | 0.8124 |
0.4162 | 1.21 | 14800 | 0.4841 | 0.8120 |
0.4239 | 1.22 | 15000 | 0.4756 | 0.8189 |
0.4215 | 1.24 | 15200 | 0.4957 | 0.8088 |
0.4157 | 1.25 | 15400 | 0.4845 | 0.8112 |
0.3982 | 1.27 | 15600 | 0.5064 | 0.8048 |
0.4056 | 1.29 | 15800 | 0.4827 | 0.8241 |
0.4105 | 1.3 | 16000 | 0.4936 | 0.8088 |
0.4221 | 1.32 | 16200 | 0.4800 | 0.8129 |
0.4029 | 1.34 | 16400 | 0.4790 | 0.8181 |
0.4346 | 1.35 | 16600 | 0.4802 | 0.8137 |
0.4163 | 1.37 | 16800 | 0.4838 | 0.8213 |
0.4106 | 1.39 | 17000 | 0.4905 | 0.8209 |
0.4071 | 1.4 | 17200 | 0.4889 | 0.8153 |
0.4077 | 1.42 | 17400 | 0.4801 | 0.8165 |
0.4074 | 1.43 | 17600 | 0.4765 | 0.8217 |
0.4095 | 1.45 | 17800 | 0.4942 | 0.8096 |
0.4117 | 1.47 | 18000 | 0.4668 | 0.8225 |
0.3991 | 1.48 | 18200 | 0.4814 | 0.8161 |
0.4114 | 1.5 | 18400 | 0.4757 | 0.8193 |
0.4061 | 1.52 | 18600 | 0.4702 | 0.8209 |
0.4104 | 1.53 | 18800 | 0.4814 | 0.8149 |
0.3997 | 1.55 | 19000 | 0.4833 | 0.8141 |
0.3992 | 1.56 | 19200 | 0.4847 | 0.8169 |
0.4021 | 1.58 | 19400 | 0.4893 | 0.8189 |
0.4284 | 1.6 | 19600 | 0.4806 | 0.8173 |
0.3915 | 1.61 | 19800 | 0.4952 | 0.8092 |
0.4122 | 1.63 | 20000 | 0.4917 | 0.8112 |
0.4164 | 1.65 | 20200 | 0.4769 | 0.8157 |
0.4063 | 1.66 | 20400 | 0.4723 | 0.8141 |
0.4087 | 1.68 | 20600 | 0.4701 | 0.8157 |
0.4159 | 1.69 | 20800 | 0.4826 | 0.8141 |
0.4 | 1.71 | 21000 | 0.4760 | 0.8133 |
0.4024 | 1.73 | 21200 | 0.4755 | 0.8161 |
0.4201 | 1.74 | 21400 | 0.4728 | 0.8173 |
0.4066 | 1.76 | 21600 | 0.4690 | 0.8157 |
0.3941 | 1.78 | 21800 | 0.4687 | 0.8181 |
0.3987 | 1.79 | 22000 | 0.4735 | 0.8149 |
0.4074 | 1.81 | 22200 | 0.4715 | 0.8137 |
0.4083 | 1.83 | 22400 | 0.4660 | 0.8181 |
0.4107 | 1.84 | 22600 | 0.4699 | 0.8161 |
0.3924 | 1.86 | 22800 | 0.4732 | 0.8153 |
0.4205 | 1.87 | 23000 | 0.4686 | 0.8177 |
0.3962 | 1.89 | 23200 | 0.4688 | 0.8177 |
0.3888 | 1.91 | 23400 | 0.4778 | 0.8124 |
0.3978 | 1.92 | 23600 | 0.4713 | 0.8145 |
0.3963 | 1.94 | 23800 | 0.4704 | 0.8145 |
0.408 | 1.96 | 24000 | 0.4674 | 0.8165 |
0.4014 | 1.97 | 24200 | 0.4679 | 0.8161 |
0.3951 | 1.99 | 24400 | 0.4681 | 0.8157 |
### Framework versions
- Transformers 4.21.0
- Pytorch 1.12.0+cu102
- Datasets 2.4.0
- Tokenizers 0.12.1