distilbert-base-cased-pii-en

This model is a fine-tuned version of distilbert-base-cased, trained on the English samples of ai4privacy/pii-masking-300k to tag personally identifiable information (PII) at the token level.

Usage:

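from transformers import pipeline

# aggregation_strategy="first" merges word pieces into whole words, labelling each word by its first token
pipe = pipeline("token-classification", model="yonigo/distilbert-base-cased-pii-en", aggregation_strategy="first")
# Each result is a dict with the keys entity_group, score, word, start and end
print(pipe("My name is Yoni Go and I live in Israel. My phone number is 054-1234567"))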
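The pipeline returns character offsets for every detected entity, so one common follow-up is to replace those spans with placeholders. The sketch below is not part of the model's own code: `mask_pii` is a hypothetical helper, and the entity list is a hand-written stand-in for pipeline output (the labels and offsets the model actually returns may differ).

```python
def mask_pii(text, entities):
    """Replace each detected span with a [LABEL] placeholder."""
    masked = text
    # Work right to left so earlier character offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        masked = masked[:ent["start"]] + f"[{ent['entity_group']}]" + masked[ent["end"]:]
    return masked


# Hand-written stand-in for pipeline output (assumption, not real model predictions).
example_text = "My name is Yoni Go and I live in Israel."
example_entities = [
    {"entity_group": "GIVENNAME1", "start": 11, "end": 15},
    {"entity_group": "COUNTRY", "start": 33, "end": 39},
]
print(mask_pii(example_text, example_entities))
```

With a loaded pipeline, the same helper could be called as `mask_pii(text, pipe(text))`, since aggregated results carry `start`, `end` and `entity_group` keys.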

Training code: git

Training results

| Training Loss | Epoch | Step | Validation Loss | Bod F1 | Building F1 | Cardissuer F1 | City F1 | Country F1 | Date F1 | Driverlicense F1 | Email F1 | Geocoord F1 | Givenname1 F1 | Givenname2 F1 | Idcard F1 | Ip F1 | Lastname1 F1 | Lastname2 F1 | Lastname3 F1 | Pass F1 | Passport F1 | Postcode F1 | Secaddress F1 | Sex F1 | Socialnumber F1 | State F1 | Street F1 | Tel F1 | Time F1 | Title F1 | Username F1 | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.23 | 2.1368 | 1000 | 0.1111 | 0.8804 | 0.9218 | 0.0 | 0.5854 | 0.8890 | 0.8009 | 0.5896 | 0.9469 | 0.7002 | 0.4149 | 0.0 | 0.5740 | 0.9056 | 0.5230 | 0.0 | 0.0 | 0.6551 | 0.6301 | 0.7819 | 0.5871 | 0.8396 | 0.7137 | 0.8771 | 0.6466 | 0.8567 | 0.9185 | 0.7411 | 0.8202 | 0.7127 | 0.7640 | 0.7374 | 0.9723 |
| 0.0687 | 4.2735 | 2000 | 0.0541 | 0.9365 | 0.9688 | 0.0 | 0.9115 | 0.9434 | 0.8841 | 0.8530 | 0.9768 | 0.9698 | 0.7276 | 0.3073 | 0.8720 | 0.9701 | 0.6430 | 0.1854 | 0.0 | 0.8169 | 0.8857 | 0.9638 | 0.9617 | 0.9312 | 0.8927 | 0.9418 | 0.9231 | 0.9438 | 0.9461 | 0.9232 | 0.9138 | 0.8698 | 0.9020 | 0.8856 | 0.9862 |
| 0.0452 | 6.4103 | 3000 | 0.0469 | 0.9550 | 0.9753 | 0.0 | 0.9404 | 0.9620 | 0.9014 | 0.9004 | 0.9831 | 0.9721 | 0.7664 | 0.4606 | 0.9099 | 0.97 | 0.6838 | 0.3818 | 0.0 | 0.8611 | 0.9079 | 0.9757 | 0.9715 | 0.9661 | 0.9197 | 0.9674 | 0.9435 | 0.9448 | 0.96 | 0.9354 | 0.9357 | 0.8975 | 0.9212 | 0.9092 | 0.9883 |
| 0.0311 | 8.5470 | 4000 | 0.0434 | 0.9581 | 0.9753 | 0.0 | 0.9481 | 0.9641 | 0.8973 | 0.9217 | 0.9833 | 0.9745 | 0.8085 | 0.6085 | 0.9232 | 0.9781 | 0.7519 | 0.4891 | 0.0849 | 0.8685 | 0.9280 | 0.9771 | 0.9704 | 0.9659 | 0.9278 | 0.9729 | 0.9549 | 0.9563 | 0.9594 | 0.9564 | 0.9407 | 0.9117 | 0.9344 | 0.9229 | 0.9897 |
| 0.023 | 10.6838 | 5000 | 0.0416 | 0.9578 | 0.9778 | 0.0 | 0.9503 | 0.9639 | 0.9043 | 0.9245 | 0.9804 | 0.9767 | 0.8277 | 0.6607 | 0.9031 | 0.9836 | 0.7790 | 0.5779 | 0.25 | 0.8791 | 0.9276 | 0.9767 | 0.9660 | 0.9708 | 0.9353 | 0.9749 | 0.9610 | 0.9681 | 0.9611 | 0.9545 | 0.9382 | 0.9185 | 0.9363 | 0.9273 | 0.9903 |
| 0.016 | 12.8205 | 6000 | 0.0436 | 0.9607 | 0.9778 | 0.0 | 0.9516 | 0.9698 | 0.9027 | 0.9274 | 0.9781 | 0.9677 | 0.8472 | 0.6913 | 0.9259 | 0.9877 | 0.7857 | 0.5925 | 0.3819 | 0.8853 | 0.9377 | 0.9731 | 0.9688 | 0.9752 | 0.9461 | 0.9756 | 0.9594 | 0.9613 | 0.9568 | 0.9545 | 0.9447 | 0.9247 | 0.9396 | 0.9321 | 0.9907 |
| 0.012 | 14.9573 | 7000 | 0.0447 | 0.9598 | 0.9808 | 0.0 | 0.9558 | 0.9737 | 0.9070 | 0.9339 | 0.9719 | 0.9654 | 0.8482 | 0.7009 | 0.9280 | 0.9859 | 0.7879 | 0.6027 | 0.4729 | 0.8909 | 0.9413 | 0.9781 | 0.9750 | 0.9740 | 0.9431 | 0.9778 | 0.9624 | 0.9623 | 0.9665 | 0.9555 | 0.9403 | 0.9275 | 0.9413 | 0.9344 | 0.9911 |
| 0.0086 | 17.0940 | 8000 | 0.0487 | 0.9584 | 0.9814 | 0.0 | 0.9529 | 0.9733 | 0.9114 | 0.9350 | 0.9857 | 0.9767 | 0.8493 | 0.7059 | 0.9306 | 0.9829 | 0.7918 | 0.6216 | 0.5292 | 0.8934 | 0.9330 | 0.9790 | 0.9766 | 0.9750 | 0.9395 | 0.9747 | 0.9624 | 0.9605 | 0.9631 | 0.9580 | 0.9399 | 0.9282 | 0.9423 | 0.9352 | 0.9908 |
| 0.0062 | 19.2308 | 9000 | 0.0509 | 0.9596 | 0.9795 | 0.0 | 0.9594 | 0.9708 | 0.9128 | 0.9299 | 0.9872 | 0.9837 | 0.8491 | 0.7188 | 0.9270 | 0.9859 | 0.7923 | 0.6468 | 0.5371 | 0.8919 | 0.9369 | 0.9783 | 0.9756 | 0.9749 | 0.9382 | 0.9778 | 0.9650 | 0.9647 | 0.9653 | 0.9559 | 0.9461 | 0.9337 | 0.9394 | 0.9365 | 0.9911 |
| 0.0045 | 21.3675 | 10000 | 0.0548 | 0.9559 | 0.9774 | 0.0 | 0.9524 | 0.9720 | 0.9080 | 0.9348 | 0.9827 | 0.9814 | 0.8446 | 0.7117 | 0.9271 | 0.9800 | 0.7977 | 0.6428 | 0.5266 | 0.8964 | 0.9351 | 0.9754 | 0.9716 | 0.9728 | 0.9478 | 0.9757 | 0.9584 | 0.9698 | 0.9587 | 0.9548 | 0.9423 | 0.9242 | 0.9457 | 0.9348 | 0.9907 |
| 0.0036 | 23.5043 | 11000 | 0.0560 | 0.9594 | 0.9781 | 0.0 | 0.9575 | 0.9720 | 0.9121 | 0.9367 | 0.9814 | 0.9814 | 0.8504 | 0.7209 | 0.9317 | 0.9807 | 0.7922 | 0.6507 | 0.5918 | 0.8864 | 0.9380 | 0.9769 | 0.9722 | 0.9745 | 0.9399 | 0.9771 | 0.9628 | 0.9675 | 0.9618 | 0.9581 | 0.9446 | 0.9288 | 0.9434 | 0.9361 | 0.9910 |
| 0.0026 | 25.6410 | 12000 | 0.0576 | 0.9596 | 0.9798 | 0.0 | 0.9575 | 0.9732 | 0.9130 | 0.9308 | 0.9831 | 0.9791 | 0.8471 | 0.7104 | 0.9268 | 0.9836 | 0.7967 | 0.6563 | 0.6222 | 0.8982 | 0.9345 | 0.9771 | 0.9739 | 0.9733 | 0.9402 | 0.9771 | 0.9673 | 0.9656 | 0.9642 | 0.9576 | 0.9480 | 0.9288 | 0.9446 | 0.9366 | 0.9910 |
| 0.002 | 27.7778 | 13000 | 0.0608 | 0.9555 | 0.9796 | 0.0 | 0.9561 | 0.9717 | 0.9045 | 0.9319 | 0.9848 | 0.9791 | 0.8488 | 0.7157 | 0.9268 | 0.9852 | 0.7909 | 0.6580 | 0.6039 | 0.8900 | 0.9360 | 0.9802 | 0.9717 | 0.9750 | 0.9361 | 0.9778 | 0.9646 | 0.9683 | 0.9615 | 0.9565 | 0.9465 | 0.9279 | 0.9433 | 0.9355 | 0.9909 |
| 0.0016 | 29.9145 | 14000 | 0.0601 | 0.9573 | 0.9801 | 0.0 | 0.9589 | 0.9722 | 0.9135 | 0.9353 | 0.9848 | 0.9837 | 0.8499 | 0.7202 | 0.9316 | 0.9871 | 0.7942 | 0.6677 | 0.6432 | 0.9017 | 0.9402 | 0.9799 | 0.9744 | 0.9740 | 0.9455 | 0.9783 | 0.9631 | 0.9719 | 0.9645 | 0.9600 | 0.9482 | 0.9331 | 0.9443 | 0.9387 | 0.9913 |
| 0.0013 | 32.0513 | 15000 | 0.0613 | 0.9606 | 0.9798 | 0.0 | 0.9571 | 0.9739 | 0.9155 | 0.9365 | 0.9840 | 0.9791 | 0.8466 | 0.7221 | 0.9321 | 0.9861 | 0.7948 | 0.6622 | 0.6281 | 0.9017 | 0.9407 | 0.9787 | 0.9739 | 0.9740 | 0.9475 | 0.9774 | 0.9632 | 0.9693 | 0.9644 | 0.9594 | 0.9469 | 0.9313 | 0.9457 | 0.9384 | 0.9912 |
| 0.001 | 34.1880 | 16000 | 0.0639 | 0.9611 | 0.9808 | 0.0 | 0.9601 | 0.9729 | 0.9149 | 0.9337 | 0.9867 | 0.9814 | 0.8483 | 0.7269 | 0.9307 | 0.9838 | 0.7956 | 0.6627 | 0.6154 | 0.9025 | 0.9397 | 0.9797 | 0.9733 | 0.9738 | 0.9383 | 0.9776 | 0.9636 | 0.9674 | 0.9637 | 0.9576 | 0.9480 | 0.9304 | 0.9457 | 0.9380 | 0.9911 |
| 0.0009 | 36.3248 | 17000 | 0.0621 | 0.9622 | 0.9811 | 0.0 | 0.9604 | 0.9741 | 0.9156 | 0.9359 | 0.9855 | 0.9814 | 0.8510 | 0.7273 | 0.9319 | 0.9859 | 0.7991 | 0.6646 | 0.6413 | 0.8999 | 0.9393 | 0.9789 | 0.9739 | 0.9740 | 0.9427 | 0.9789 | 0.9653 | 0.9687 | 0.9637 | 0.9597 | 0.9460 | 0.9324 | 0.9456 | 0.9390 | 0.9913 |
| 0.0008 | 38.4615 | 18000 | 0.0631 | 0.9620 | 0.9801 | 0.0 | 0.9582 | 0.9744 | 0.9190 | 0.9350 | 0.9853 | 0.9814 | 0.8514 | 0.7253 | 0.9298 | 0.9848 | 0.7992 | 0.6677 | 0.6434 | 0.8992 | 0.9401 | 0.9797 | 0.9739 | 0.9731 | 0.9421 | 0.9789 | 0.9653 | 0.9681 | 0.9643 | 0.9592 | 0.9462 | 0.9319 | 0.9457 | 0.9388 | 0.9913 |
| 0.0007 | 40.5983 | 19000 | 0.0633 | 0.9615 | 0.9806 | 0.0 | 0.9589 | 0.9741 | 0.9170 | 0.9358 | 0.9861 | 0.9814 | 0.8501 | 0.7268 | 0.9316 | 0.9857 | 0.7973 | 0.6662 | 0.6393 | 0.9002 | 0.9405 | 0.9797 | 0.9745 | 0.9738 | 0.9442 | 0.9796 | 0.9658 | 0.9679 | 0.9644 | 0.9586 | 0.9469 | 0.9323 | 0.9459 | 0.9391 | 0.9913 |
| 0.0008 | 42.7350 | 20000 | 0.0635 | 0.9622 | 0.9808 | 0.0 | 0.9589 | 0.9741 | 0.9176 | 0.9355 | 0.9855 | 0.9814 | 0.8496 | 0.7261 | 0.9306 | 0.9850 | 0.7974 | 0.6672 | 0.6393 | 0.9005 | 0.9405 | 0.9794 | 0.9745 | 0.9738 | 0.9437 | 0.9794 | 0.9651 | 0.9674 | 0.9638 | 0.9587 | 0.9469 | 0.9318 | 0.9460 | 0.9388 | 0.9913 |
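
The last four columns report overall precision, recall, F1 and accuracy. As a sanity check on how they relate, the sketch below recomputes F1 as the harmonic mean of precision and recall for the final checkpoint (step 20000). It assumes the reported overall F1 is derived from the reported overall precision and recall in this way; `f1_from` is a hypothetical helper name.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall reported for step 20000 in the table above.
final_precision, final_recall = 0.9318, 0.9460
final_f1 = f1_from(final_precision, final_recall)
print(round(final_f1, 4))  # 0.9388, the F1 reported for that row
```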

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 65.2M params (Safetensors, tensor type F32)