---
license: apache-2.0
base_model: distilbert-base-cased
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: distilbert-base-cased-pii-en
  results: []
datasets:
- ai4privacy/pii-masking-300k
language:
- en
pipeline_tag: token-classification

widget:
- text: "My name is Yoni Go and I live in Israel. My phone number is 054-1234567"

inference:
  parameters:
    aggregation_strategy: "first"
---


# distilbert-base-cased-pii-en

This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on English samples from [ai4privacy/pii-masking-300k](https://huggingface.co/datasets/ai4privacy/pii-masking-300k).

Usage:
```python
from transformers import pipeline

pipe = pipeline("token-classification", model="yonigo/distilbert-base-cased-pii-en", aggregation_strategy="first")
pipe("My name is Yoni Go and I live in Israel. My phone number is 054-1234567")
```
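
With an aggregation strategy set, the pipeline returns a list of dicts with `entity_group`, `score`, `word`, `start`, and `end` keys. The sketch below shows one way to use those character offsets to mask detected spans. The `mask_entities` helper and the hand-written `entities` list are illustrative assumptions, not actual model output; the label names are taken from the columns of the training results table and may not match the model's exact label strings.

```python
def mask_entities(text, entities):
    """Replace each detected span with its label in square brackets.

    `entities` is expected to look like the aggregated output of a
    token-classification pipeline: dicts with "entity_group", "start", "end".
    """
    # Work from the end of the string so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


text = "My name is Yoni Go and I live in Israel. My phone number is 054-1234567"


def span(label, word):
    # Hypothetical helper: builds a pipeline-style dict for a known substring.
    start = text.index(word)
    return {"entity_group": label, "start": start, "end": start + len(word)}


# Hand-written stand-in for `pipe(text)`; labels are assumed for illustration.
entities = [
    span("GIVENNAME1", "Yoni"),
    span("LASTNAME1", "Go"),
    span("COUNTRY", "Israel"),
    span("TEL", "054-1234567"),
]

masked = mask_entities(text, entities)
print(masked)
```

In real use, `entities` would be replaced by the result of `pipe(text)` from the snippet above.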

Training code is available on [GitHub](https://github.com/yonigottesman/pii-model).




### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Bod F1 | Building F1 | Cardissuer F1 | City F1 | Country F1 | Date F1 | Driverlicense F1 | Email F1 | Geocoord F1 | Givenname1 F1 | Givenname2 F1 | Idcard F1 | Ip F1  | Lastname1 F1 | Lastname2 F1 | Lastname3 F1 | Pass F1 | Passport F1 | Postcode F1 | Secaddress F1 | Sex F1 | Socialnumber F1 | State F1 | Street F1 | Tel F1 | Time F1 | Title F1 | Username F1 | Precision | Recall | F1     | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:-----------:|:-------------:|:-------:|:----------:|:-------:|:----------------:|:--------:|:-----------:|:-------------:|:-------------:|:---------:|:------:|:------------:|:------------:|:------------:|:-------:|:-----------:|:-----------:|:-------------:|:------:|:---------------:|:--------:|:---------:|:------:|:-------:|:--------:|:-----------:|:---------:|:------:|:------:|:--------:|
| 0.23          | 2.1368  | 1000  | 0.1111          | 0.8804 | 0.9218      | 0.0           | 0.5854  | 0.8890     | 0.8009  | 0.5896           | 0.9469   | 0.7002      | 0.4149        | 0.0           | 0.5740    | 0.9056 | 0.5230       | 0.0          | 0.0          | 0.6551  | 0.6301      | 0.7819      | 0.5871        | 0.8396 | 0.7137          | 0.8771   | 0.6466    | 0.8567 | 0.9185  | 0.7411   | 0.8202      | 0.7127    | 0.7640 | 0.7374 | 0.9723   |
| 0.0687        | 4.2735  | 2000  | 0.0541          | 0.9365 | 0.9688      | 0.0           | 0.9115  | 0.9434     | 0.8841  | 0.8530           | 0.9768   | 0.9698      | 0.7276        | 0.3073        | 0.8720    | 0.9701 | 0.6430       | 0.1854       | 0.0          | 0.8169  | 0.8857      | 0.9638      | 0.9617        | 0.9312 | 0.8927          | 0.9418   | 0.9231    | 0.9438 | 0.9461  | 0.9232   | 0.9138      | 0.8698    | 0.9020 | 0.8856 | 0.9862   |
| 0.0452        | 6.4103  | 3000  | 0.0469          | 0.9550 | 0.9753      | 0.0           | 0.9404  | 0.9620     | 0.9014  | 0.9004           | 0.9831   | 0.9721      | 0.7664        | 0.4606        | 0.9099    | 0.97   | 0.6838       | 0.3818       | 0.0          | 0.8611  | 0.9079      | 0.9757      | 0.9715        | 0.9661 | 0.9197          | 0.9674   | 0.9435    | 0.9448 | 0.96    | 0.9354   | 0.9357      | 0.8975    | 0.9212 | 0.9092 | 0.9883   |
| 0.0311        | 8.5470  | 4000  | 0.0434          | 0.9581 | 0.9753      | 0.0           | 0.9481  | 0.9641     | 0.8973  | 0.9217           | 0.9833   | 0.9745      | 0.8085        | 0.6085        | 0.9232    | 0.9781 | 0.7519       | 0.4891       | 0.0849       | 0.8685  | 0.9280      | 0.9771      | 0.9704        | 0.9659 | 0.9278          | 0.9729   | 0.9549    | 0.9563 | 0.9594  | 0.9564   | 0.9407      | 0.9117    | 0.9344 | 0.9229 | 0.9897   |
| 0.023         | 10.6838 | 5000  | 0.0416          | 0.9578 | 0.9778      | 0.0           | 0.9503  | 0.9639     | 0.9043  | 0.9245           | 0.9804   | 0.9767      | 0.8277        | 0.6607        | 0.9031    | 0.9836 | 0.7790       | 0.5779       | 0.25         | 0.8791  | 0.9276      | 0.9767      | 0.9660        | 0.9708 | 0.9353          | 0.9749   | 0.9610    | 0.9681 | 0.9611  | 0.9545   | 0.9382      | 0.9185    | 0.9363 | 0.9273 | 0.9903   |
| 0.016         | 12.8205 | 6000  | 0.0436          | 0.9607 | 0.9778      | 0.0           | 0.9516  | 0.9698     | 0.9027  | 0.9274           | 0.9781   | 0.9677      | 0.8472        | 0.6913        | 0.9259    | 0.9877 | 0.7857       | 0.5925       | 0.3819       | 0.8853  | 0.9377      | 0.9731      | 0.9688        | 0.9752 | 0.9461          | 0.9756   | 0.9594    | 0.9613 | 0.9568  | 0.9545   | 0.9447      | 0.9247    | 0.9396 | 0.9321 | 0.9907   |
| 0.012         | 14.9573 | 7000  | 0.0447          | 0.9598 | 0.9808      | 0.0           | 0.9558  | 0.9737     | 0.9070  | 0.9339           | 0.9719   | 0.9654      | 0.8482        | 0.7009        | 0.9280    | 0.9859 | 0.7879       | 0.6027       | 0.4729       | 0.8909  | 0.9413      | 0.9781      | 0.9750        | 0.9740 | 0.9431          | 0.9778   | 0.9624    | 0.9623 | 0.9665  | 0.9555   | 0.9403      | 0.9275    | 0.9413 | 0.9344 | 0.9911   |
| 0.0086        | 17.0940 | 8000  | 0.0487          | 0.9584 | 0.9814      | 0.0           | 0.9529  | 0.9733     | 0.9114  | 0.9350           | 0.9857   | 0.9767      | 0.8493        | 0.7059        | 0.9306    | 0.9829 | 0.7918       | 0.6216       | 0.5292       | 0.8934  | 0.9330      | 0.9790      | 0.9766        | 0.9750 | 0.9395          | 0.9747   | 0.9624    | 0.9605 | 0.9631  | 0.9580   | 0.9399      | 0.9282    | 0.9423 | 0.9352 | 0.9908   |
| 0.0062        | 19.2308 | 9000  | 0.0509          | 0.9596 | 0.9795      | 0.0           | 0.9594  | 0.9708     | 0.9128  | 0.9299           | 0.9872   | 0.9837      | 0.8491        | 0.7188        | 0.9270    | 0.9859 | 0.7923       | 0.6468       | 0.5371       | 0.8919  | 0.9369      | 0.9783      | 0.9756        | 0.9749 | 0.9382          | 0.9778   | 0.9650    | 0.9647 | 0.9653  | 0.9559   | 0.9461      | 0.9337    | 0.9394 | 0.9365 | 0.9911   |
| 0.0045        | 21.3675 | 10000 | 0.0548          | 0.9559 | 0.9774      | 0.0           | 0.9524  | 0.9720     | 0.9080  | 0.9348           | 0.9827   | 0.9814      | 0.8446        | 0.7117        | 0.9271    | 0.9800 | 0.7977       | 0.6428       | 0.5266       | 0.8964  | 0.9351      | 0.9754      | 0.9716        | 0.9728 | 0.9478          | 0.9757   | 0.9584    | 0.9698 | 0.9587  | 0.9548   | 0.9423      | 0.9242    | 0.9457 | 0.9348 | 0.9907   |
| 0.0036        | 23.5043 | 11000 | 0.0560          | 0.9594 | 0.9781      | 0.0           | 0.9575  | 0.9720     | 0.9121  | 0.9367           | 0.9814   | 0.9814      | 0.8504        | 0.7209        | 0.9317    | 0.9807 | 0.7922       | 0.6507       | 0.5918       | 0.8864  | 0.9380      | 0.9769      | 0.9722        | 0.9745 | 0.9399          | 0.9771   | 0.9628    | 0.9675 | 0.9618  | 0.9581   | 0.9446      | 0.9288    | 0.9434 | 0.9361 | 0.9910   |
| 0.0026        | 25.6410 | 12000 | 0.0576          | 0.9596 | 0.9798      | 0.0           | 0.9575  | 0.9732     | 0.9130  | 0.9308           | 0.9831   | 0.9791      | 0.8471        | 0.7104        | 0.9268    | 0.9836 | 0.7967       | 0.6563       | 0.6222       | 0.8982  | 0.9345      | 0.9771      | 0.9739        | 0.9733 | 0.9402          | 0.9771   | 0.9673    | 0.9656 | 0.9642  | 0.9576   | 0.9480      | 0.9288    | 0.9446 | 0.9366 | 0.9910   |
| 0.002         | 27.7778 | 13000 | 0.0608          | 0.9555 | 0.9796      | 0.0           | 0.9561  | 0.9717     | 0.9045  | 0.9319           | 0.9848   | 0.9791      | 0.8488        | 0.7157        | 0.9268    | 0.9852 | 0.7909       | 0.6580       | 0.6039       | 0.8900  | 0.9360      | 0.9802      | 0.9717        | 0.9750 | 0.9361          | 0.9778   | 0.9646    | 0.9683 | 0.9615  | 0.9565   | 0.9465      | 0.9279    | 0.9433 | 0.9355 | 0.9909   |
| 0.0016        | 29.9145 | 14000 | 0.0601          | 0.9573 | 0.9801      | 0.0           | 0.9589  | 0.9722     | 0.9135  | 0.9353           | 0.9848   | 0.9837      | 0.8499        | 0.7202        | 0.9316    | 0.9871 | 0.7942       | 0.6677       | 0.6432       | 0.9017  | 0.9402      | 0.9799      | 0.9744        | 0.9740 | 0.9455          | 0.9783   | 0.9631    | 0.9719 | 0.9645  | 0.9600   | 0.9482      | 0.9331    | 0.9443 | 0.9387 | 0.9913   |
| 0.0013        | 32.0513 | 15000 | 0.0613          | 0.9606 | 0.9798      | 0.0           | 0.9571  | 0.9739     | 0.9155  | 0.9365           | 0.9840   | 0.9791      | 0.8466        | 0.7221        | 0.9321    | 0.9861 | 0.7948       | 0.6622       | 0.6281       | 0.9017  | 0.9407      | 0.9787      | 0.9739        | 0.9740 | 0.9475          | 0.9774   | 0.9632    | 0.9693 | 0.9644  | 0.9594   | 0.9469      | 0.9313    | 0.9457 | 0.9384 | 0.9912   |
| 0.001         | 34.1880 | 16000 | 0.0639          | 0.9611 | 0.9808      | 0.0           | 0.9601  | 0.9729     | 0.9149  | 0.9337           | 0.9867   | 0.9814      | 0.8483        | 0.7269        | 0.9307    | 0.9838 | 0.7956       | 0.6627       | 0.6154       | 0.9025  | 0.9397      | 0.9797      | 0.9733        | 0.9738 | 0.9383          | 0.9776   | 0.9636    | 0.9674 | 0.9637  | 0.9576   | 0.9480      | 0.9304    | 0.9457 | 0.9380 | 0.9911   |
| 0.0009        | 36.3248 | 17000 | 0.0621          | 0.9622 | 0.9811      | 0.0           | 0.9604  | 0.9741     | 0.9156  | 0.9359           | 0.9855   | 0.9814      | 0.8510        | 0.7273        | 0.9319    | 0.9859 | 0.7991       | 0.6646       | 0.6413       | 0.8999  | 0.9393      | 0.9789      | 0.9739        | 0.9740 | 0.9427          | 0.9789   | 0.9653    | 0.9687 | 0.9637  | 0.9597   | 0.9460      | 0.9324    | 0.9456 | 0.9390 | 0.9913   |
| 0.0008        | 38.4615 | 18000 | 0.0631          | 0.9620 | 0.9801      | 0.0           | 0.9582  | 0.9744     | 0.9190  | 0.9350           | 0.9853   | 0.9814      | 0.8514        | 0.7253        | 0.9298    | 0.9848 | 0.7992       | 0.6677       | 0.6434       | 0.8992  | 0.9401      | 0.9797      | 0.9739        | 0.9731 | 0.9421          | 0.9789   | 0.9653    | 0.9681 | 0.9643  | 0.9592   | 0.9462      | 0.9319    | 0.9457 | 0.9388 | 0.9913   |
| 0.0007        | 40.5983 | 19000 | 0.0633          | 0.9615 | 0.9806      | 0.0           | 0.9589  | 0.9741     | 0.9170  | 0.9358           | 0.9861   | 0.9814      | 0.8501        | 0.7268        | 0.9316    | 0.9857 | 0.7973       | 0.6662       | 0.6393       | 0.9002  | 0.9405      | 0.9797      | 0.9745        | 0.9738 | 0.9442          | 0.9796   | 0.9658    | 0.9679 | 0.9644  | 0.9586   | 0.9469      | 0.9323    | 0.9459 | 0.9391 | 0.9913   |
| 0.0008        | 42.7350 | 20000 | 0.0635          | 0.9622 | 0.9808      | 0.0           | 0.9589  | 0.9741     | 0.9176  | 0.9355           | 0.9855   | 0.9814      | 0.8496        | 0.7261        | 0.9306    | 0.9850 | 0.7974       | 0.6672       | 0.6393       | 0.9005  | 0.9405      | 0.9794      | 0.9745        | 0.9738 | 0.9437          | 0.9794   | 0.9651    | 0.9674 | 0.9638  | 0.9587   | 0.9469      | 0.9318    | 0.9460 | 0.9388 | 0.9913   |
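
The overall F1 column appears to be the harmonic mean of the overall Precision and Recall columns. The short check below recomputes it for the final row (step 20000), assuming that definition; the function name `f1_score` is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall Precision and Recall from the last row of the table above.
final_precision = 0.9318
final_recall = 0.9460

final_f1 = f1_score(final_precision, final_recall)
print(round(final_f1, 4))  # agrees with the reported 0.9388 up to rounding
```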


### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1