distilbert_base_finetuned
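This model is a fine-tuned version of distilbert-base-uncased on the English subset of the pii200k dataset. It achieves the following results on the evaluation set; the figures match the epoch 3 row (step 3264) of the training results table, which has the lowest validation loss: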
- Loss: 0.1096
- Overall Precision: 0.8992
- Overall Recall: 0.9251
- Overall F1: 0.9120
- Overall Accuracy: 0.9546
- Accountname F1: 0.9861
- Accountnumber F1: 0.9809
- Age F1: 0.9202
- Amount F1: 0.9408
- Bic F1: 0.8869
- Bitcoinaddress F1: 0.9502
- Buildingnumber F1: 0.8860
- City F1: 0.9207
- Companyname F1: 0.9693
- County F1: 0.9725
- Creditcardcvv F1: 0.9107
- Creditcardissuer F1: 0.9872
- Creditcardnumber F1: 0.8675
- Currency F1: 0.7147
- Currencycode F1: 0.6585
- Currencyname F1: 0.0123
- Currencysymbol F1: 0.8368
- Date F1: 0.8193
- Dob F1: 0.5701
- Email F1: 0.9953
- Ethereumaddress F1: 0.9877
- Eyecolor F1: 0.9302
- Firstname F1: 0.9602
- Gender F1: 0.9568
- Height F1: 0.9695
- Iban F1: 0.9751
- Ip F1: 0.0
- Ipv4 F1: 0.8265
- Ipv6 F1: 0.7527
- Jobarea F1: 0.9133
- Jobtitle F1: 0.9728
- Jobtype F1: 0.9297
- Lastname F1: 0.9333
- Litecoinaddress F1: 0.8225
- Mac F1: 0.9957
- Maskednumber F1: 0.8108
- Middlename F1: 0.9247
- Nearbygpscoordinate F1: 1.0
- Ordinaldirection F1: 0.9533
- Password F1: 0.9174
- Phoneimei F1: 0.9862
- Phonenumber F1: 0.9759
- Pin F1: 0.8829
- Prefix F1: 0.9340
- Secondaryaddress F1: 0.9829
- Sex F1: 0.9791
- Ssn F1: 0.9703
- State F1: 0.9521
- Street F1: 0.9349
- Time F1: 0.9816
- Url F1: 0.9982
- Useragent F1: 0.9813
- Username F1: 0.9743
- Vehiclevin F1: 0.9712
- Vehiclevrm F1: 0.9526
- Zipcode F1: 0.8184
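As a quick consistency check, the overall F1 above can be recomputed from the reported overall precision and recall. This is a minimal sketch that assumes F1 is the usual harmonic mean of the two; the card does not state which metric library produced the numbers, and the helper name `f1_from_pr` is purely illustrative.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
overall_precision = 0.8992
overall_recall = 0.9251

overall_f1 = f1_from_pr(overall_precision, overall_recall)
print(f"{overall_f1:.4f}")  # 0.9120
```

The recomputed value agrees with the reported Overall F1 to four decimal places.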
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 7
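To make the scheduler settings concrete, the sketch below reproduces a linear schedule with warmup in plain Python. It is an illustration under stated assumptions, not the training code: the total step count (7616) is read from the last row of the training results table, the warmup step count is assumed to be rounded up, and `linear_lr` is a hypothetical helper name.

```python
import math

BASE_LR = 5e-05       # learning_rate from the list above
WARMUP_RATIO = 0.2    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 7616    # final step in the training results table (7 epochs x 1088 steps)

# Assumption: the warmup step count is rounded up to a whole step.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * (step / WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * (remaining / (TOTAL_STEPS - WARMUP_STEPS))


# Print the rate at the start, the end of warmup, the epoch 3 step, and the end.
for step in (0, WARMUP_STEPS, 3264, TOTAL_STEPS):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 partway through the second epoch and then decays linearly to zero by the final step.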
Training results
Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy | Accountname F1 | Accountnumber F1 | Age F1 | Amount F1 | Bic F1 | Bitcoinaddress F1 | Buildingnumber F1 | City F1 | Companyname F1 | County F1 | Creditcardcvv F1 | Creditcardissuer F1 | Creditcardnumber F1 | Currency F1 | Currencycode F1 | Currencyname F1 | Currencysymbol F1 | Date F1 | Dob F1 | Email F1 | Ethereumaddress F1 | Eyecolor F1 | Firstname F1 | Gender F1 | Height F1 | Iban F1 | Ip F1 | Ipv4 F1 | Ipv6 F1 | Jobarea F1 | Jobtitle F1 | Jobtype F1 | Lastname F1 | Litecoinaddress F1 | Mac F1 | Maskednumber F1 | Middlename F1 | Nearbygpscoordinate F1 | Ordinaldirection F1 | Password F1 | Phoneimei F1 | Phonenumber F1 | Pin F1 | Prefix F1 | Secondaryaddress F1 | Sex F1 | Ssn F1 | State F1 | Street F1 | Time F1 | Url F1 | Useragent F1 | Username F1 | Vehiclevin F1 | Vehiclevrm F1 | Zipcode F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.4764 | 1.0 | 1088 | 0.2240 | 0.6718 | 0.7532 | 0.7102 | 0.9283 | 0.8807 | 0.9560 | 0.7916 | 0.6034 | 0.4684 | 0.8385 | 0.6515 | 0.6041 | 0.8988 | 0.6165 | 0.2137 | 0.7101 | 0.6661 | 0.3774 | 0.0 | 0.0 | 0.4411 | 0.7095 | 0.1332 | 0.9859 | 0.9712 | 0.4963 | 0.8349 | 0.6953 | 0.8675 | 0.9045 | 0.0018 | 0.0484 | 0.7792 | 0.5532 | 0.7598 | 0.6803 | 0.7476 | 0.4354 | 0.9806 | 0.5663 | 0.1526 | 0.9985 | 0.8345 | 0.7584 | 0.9741 | 0.9326 | 0.1657 | 0.9104 | 0.8907 | 0.8920 | 0.8820 | 0.4878 | 0.6348 | 0.9580 | 0.9759 | 0.9398 | 0.9054 | 0.7335 | 0.5931 | 0.5893 |
0.1476 | 2.0 | 2176 | 0.1248 | 0.8445 | 0.9023 | 0.8725 | 0.9494 | 0.9653 | 0.9700 | 0.9177 | 0.9124 | 0.9003 | 0.9273 | 0.8761 | 0.9196 | 0.9694 | 0.9537 | 0.8958 | 0.9825 | 0.8528 | 0.6293 | 0.4828 | 0.0 | 0.7793 | 0.8291 | 0.5297 | 0.9882 | 0.9758 | 0.9064 | 0.9353 | 0.9426 | 0.9759 | 0.9313 | 0.0288 | 0.6916 | 0.4490 | 0.8870 | 0.9542 | 0.9176 | 0.8924 | 0.7650 | 0.9871 | 0.6870 | 0.8530 | 1.0 | 0.9469 | 0.9526 | 0.9890 | 0.9447 | 0.8103 | 0.9261 | 0.9694 | 0.9684 | 0.9611 | 0.9417 | 0.8784 | 0.9660 | 0.9973 | 0.9657 | 0.9639 | 0.9744 | 0.9617 | 0.8035 |
0.0959 | 3.0 | 3264 | 0.1096 | 0.8992 | 0.9251 | 0.9120 | 0.9546 | 0.9861 | 0.9809 | 0.9202 | 0.9408 | 0.8869 | 0.9502 | 0.8860 | 0.9207 | 0.9693 | 0.9725 | 0.9107 | 0.9872 | 0.8675 | 0.7147 | 0.6585 | 0.0123 | 0.8368 | 0.8193 | 0.5701 | 0.9953 | 0.9877 | 0.9302 | 0.9602 | 0.9568 | 0.9695 | 0.9751 | 0.0 | 0.8265 | 0.7527 | 0.9133 | 0.9728 | 0.9297 | 0.9333 | 0.8225 | 0.9957 | 0.8108 | 0.9247 | 1.0 | 0.9533 | 0.9174 | 0.9862 | 0.9759 | 0.8829 | 0.9340 | 0.9829 | 0.9791 | 0.9703 | 0.9521 | 0.9349 | 0.9816 | 0.9982 | 0.9813 | 0.9743 | 0.9712 | 0.9526 | 0.8184 |
0.0793 | 4.0 | 4352 | 0.1166 | 0.8968 | 0.9294 | 0.9128 | 0.9555 | 0.9816 | 0.9853 | 0.9256 | 0.9514 | 0.9206 | 0.8850 | 0.9081 | 0.9223 | 0.9722 | 0.9769 | 0.9107 | 0.9952 | 0.8934 | 0.7098 | 0.7304 | 0.1316 | 0.8543 | 0.7954 | 0.6306 | 0.9953 | 0.9789 | 0.9388 | 0.9600 | 0.9645 | 0.9863 | 0.9559 | 0.0707 | 0.7875 | 0.7765 | 0.9058 | 0.9721 | 0.9291 | 0.9426 | 0.7036 | 0.9744 | 0.8076 | 0.9394 | 1.0 | 0.9651 | 0.9392 | 0.9903 | 0.9805 | 0.8970 | 0.9352 | 0.9841 | 0.9751 | 0.9795 | 0.9718 | 0.9129 | 0.9772 | 0.9955 | 0.9780 | 0.9793 | 0.9329 | 0.9753 | 0.8933 |
0.0625 | 5.0 | 5440 | 0.1284 | 0.9022 | 0.9339 | 0.9178 | 0.9573 | 0.9889 | 0.9817 | 0.9278 | 0.9650 | 0.9427 | 0.9145 | 0.9143 | 0.9510 | 0.9760 | 0.9826 | 0.9432 | 0.9936 | 0.8812 | 0.6920 | 0.7529 | 0.3642 | 0.8702 | 0.8235 | 0.6588 | 0.9982 | 0.9877 | 0.9408 | 0.9693 | 0.9723 | 0.9931 | 0.9761 | 0.2130 | 0.7683 | 0.7055 | 0.9149 | 0.9801 | 0.9394 | 0.9389 | 0.7842 | 0.9787 | 0.8047 | 0.9388 | 1.0 | 0.9710 | 0.9698 | 0.9890 | 0.9815 | 0.9329 | 0.9351 | 0.9861 | 0.9772 | 0.9744 | 0.9713 | 0.9361 | 0.9735 | 1.0 | 0.9823 | 0.9883 | 0.9744 | 0.9756 | 0.8794 |
0.0402 | 6.0 | 6528 | 0.1608 | 0.9100 | 0.9334 | 0.9216 | 0.9578 | 0.9926 | 0.9835 | 0.9295 | 0.9634 | 0.9091 | 0.9405 | 0.9081 | 0.9517 | 0.9788 | 0.9806 | 0.9419 | 0.9904 | 0.8960 | 0.7107 | 0.7635 | 0.3600 | 0.8756 | 0.8438 | 0.6620 | 0.9982 | 0.9877 | 0.9464 | 0.9667 | 0.9722 | 0.9931 | 0.9704 | 0.2265 | 0.7973 | 0.7070 | 0.9187 | 0.9777 | 0.9392 | 0.9476 | 0.8412 | 0.9892 | 0.8187 | 0.9368 | 1.0 | 0.9710 | 0.9581 | 0.9890 | 0.9826 | 0.9231 | 0.9195 | 0.9872 | 0.9800 | 0.9806 | 0.9669 | 0.9398 | 0.9744 | 1.0 | 0.9779 | 0.9875 | 0.9712 | 0.9622 | 0.8785 |
0.0211 | 7.0 | 7616 | 0.1862 | 0.9040 | 0.9354 | 0.9194 | 0.9567 | 0.9907 | 0.9872 | 0.9297 | 0.9664 | 0.9524 | 0.9489 | 0.9135 | 0.9535 | 0.9836 | 0.9816 | 0.9507 | 0.9920 | 0.8856 | 0.6804 | 0.7692 | 0.3585 | 0.8763 | 0.8366 | 0.6809 | 0.9982 | 0.9877 | 0.9524 | 0.9708 | 0.9679 | 0.9897 | 0.9797 | 0.2845 | 0.7481 | 0.6489 | 0.9235 | 0.9794 | 0.9367 | 0.9480 | 0.8338 | 0.9787 | 0.8172 | 0.9422 | 1.0 | 0.9711 | 0.9699 | 0.9903 | 0.9836 | 0.9193 | 0.9368 | 0.9872 | 0.9820 | 0.9775 | 0.9726 | 0.9389 | 0.9789 | 1.0 | 0.9790 | 0.9899 | 0.9935 | 0.9756 | 0.8908 |
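The table shows validation loss bottoming out at epoch 3 and rising afterwards, while overall F1 keeps improving slightly through epoch 6. The sketch below picks the best row under each criterion, using values copied from the table. The card does not say how the reported checkpoint was chosen, so reading it as "lowest validation loss" is an inference from the numbers.

```python
# (epoch, step, validation_loss, overall_f1) copied from the table above.
history = [
    (1, 1088, 0.2240, 0.7102),
    (2, 2176, 0.1248, 0.8725),
    (3, 3264, 0.1096, 0.9120),
    (4, 4352, 0.1166, 0.9128),
    (5, 5440, 0.1284, 0.9178),
    (6, 6528, 0.1608, 0.9216),
    (7, 7616, 0.1862, 0.9194),
]

# Select the best row by each criterion.
best_by_loss = min(history, key=lambda row: row[2])
best_by_f1 = max(history, key=lambda row: row[3])

print("lowest validation loss:", best_by_loss)
print("highest overall F1:", best_by_f1)
```

The two criteria disagree here (epoch 3 versus epoch 6), which is worth keeping in mind when comparing the headline metrics with the later rows of the table.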
Framework versions
- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1