distilbert-base-german-cased_finetuned_ai4privacy_v2

This model is a fine-tuned version of distilbert-base-german-cased, trained on the German subset of the pii-masking-200k dataset. It achieves the following results on the evaluation set:

Loss: 0.0821
Overall Precision: 0.9086
Overall Recall: 0.9379
Overall F1: 0.9230
Overall Accuracy: 0.9679
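
As a quick consistency check, the overall F1 above can be recomputed from the reported precision and recall. This is a minimal sketch that assumes the overall F1 is the plain harmonic mean of the overall precision and recall; the helper name `f1_score` is illustrative and not part of the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
reported_precision = 0.9086
reported_recall = 0.9379
reported_f1 = 0.9230

f1 = f1_score(reported_precision, reported_recall)

# The recomputed value should agree with the reported one to four decimals.
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The two figures agree to four decimal places, so the reported metrics are internally consistent.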

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.2
num_epochs: 5
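
To make the schedule concrete, the sketch below computes the learning rate at a given optimizer step under a linear schedule with warmup. It makes several assumptions: the total of 26410 steps is taken from the last row of the training results table, the warmup length is `total_steps * warmup_ratio`, and the rate ramps linearly from 0 to the peak and then decays linearly to 0. The function name `linear_schedule_lr` is hypothetical, and the exact rounding used by the actual trainer may differ.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 5e-5,       # learning_rate from the list above
    total_steps: int = 26410,    # assumption: final step in the results table
    warmup_ratio: float = 0.2,   # lr_scheduler_warmup_ratio from the list above
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at the end of each epoch (5282 steps per epoch in the table).
for epoch_end in (5282, 10564, 15846, 21128, 26410):
    print(epoch_end, linear_schedule_lr(epoch_end))
```

Under these assumptions the warmup covers 5282 steps, which is exactly the first epoch, so the learning rate peaks at the end of epoch 1 and decays over the remaining four epochs.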

Training results

Training Loss	Epoch	Step	Validation Loss	Overall Precision	Overall Recall	Overall F1	Overall Accuracy	Accountname F1	Accountnumber F1	Age F1	Amount F1	Bic F1	Bitcoinaddress F1	Buildingnumber F1	City F1	Companyname F1	County F1	Creditcardcvv F1	Creditcardissuer F1	Creditcardnumber F1	Currency F1	Currencycode F1	Currencyname F1	Currencysymbol F1	Date F1	Dob F1	Email F1	Ethereumaddress F1	Eyecolor F1	Firstname F1	Gender F1	Height F1	Iban F1	Ip F1	Ipv4 F1	Ipv6 F1	Jobarea F1	Jobtitle F1	Jobtype F1	Lastname F1	Litecoinaddress F1	Mac F1	Maskednumber F1	Middlename F1	Nearbygpscoordinate F1	Ordinaldirection F1	Password F1	Phoneimei F1	Phonenumber F1	Pin F1	Prefix F1	Secondaryaddress F1	Sex F1	Ssn F1	State F1	Street F1	Time F1	Url F1	Useragent F1	Username F1	Vehiclevin F1	Vehiclevrm F1	Zipcode F1
0.1449	1.0	5282	0.1365	0.8213	0.8741	0.8469	0.9504	0.9954	0.9180	0.9509	0.7478	0.8315	0.8265	0.7908	0.8030	0.9011	0.9118	0.8669	0.9831	0.8053	0.4935	0.6482	0.0	0.8430	0.7672	0.4751	0.9870	0.9103	0.9501	0.8810	0.9552	0.9507	0.9086	0.0	0.8124	0.7776	0.8698	0.9758	0.9445	0.8140	0.5210	0.9819	0.6555	0.4114	1.0	0.9837	0.8093	0.9761	0.9254	0.7705	0.8613	0.9676	0.9978	0.9570	0.8585	0.8164	0.9643	0.9879	0.9534	0.9415	0.8778	0.9716	0.7313
0.1039	2.0	10564	0.0841	0.8875	0.9213	0.9041	0.9649	0.9923	0.9598	0.9721	0.8979	0.9240	0.9218	0.8937	0.8803	0.9648	0.9595	0.9563	0.9848	0.8427	0.5724	0.7677	0.2210	0.9244	0.8003	0.5866	0.9932	0.9636	0.9835	0.9473	0.9794	0.9753	0.9644	0.0173	0.7042	0.7564	0.9439	0.9911	0.9710	0.8988	0.7288	0.9801	0.7913	0.8977	0.9978	0.9853	0.9581	0.9937	0.9761	0.9146	0.9166	0.9741	0.9978	0.9787	0.9448	0.9031	0.9591	0.9968	0.9638	0.9719	0.9455	0.9829	0.8863
0.0804	3.0	15846	0.0821	0.9086	0.9379	0.9230	0.9679	0.9985	0.9849	0.9792	0.9387	0.9641	0.9637	0.9011	0.9260	0.9782	0.9778	0.9543	1.0	0.8796	0.7027	0.8328	0.3466	0.9420	0.8156	0.6575	0.9971	0.9947	0.9833	0.9614	0.9881	0.9842	0.9819	0.2023	0.6631	0.7243	0.9722	0.9904	0.9725	0.9185	0.8545	0.9780	0.8365	0.9156	1.0	0.9853	0.9782	0.9947	0.9883	0.9189	0.9594	0.9831	0.9993	0.9898	0.9739	0.9355	0.9764	0.9984	0.9885	0.9798	0.9614	1.0	0.9100
0.0622	4.0	21128	0.0848	0.9095	0.9420	0.9255	0.9713	0.9977	0.9932	0.9815	0.9566	0.9550	0.9704	0.9187	0.9277	0.9735	0.9756	0.9679	0.9966	0.8885	0.6985	0.8598	0.4217	0.9602	0.8262	0.6809	0.9960	0.9947	0.9852	0.9641	0.9952	0.9955	0.9909	0.3053	0.7067	0.6156	0.9784	0.9948	0.9773	0.9176	0.8856	0.9880	0.8598	0.9186	1.0	0.9886	0.9871	0.9968	0.9916	0.9419	0.9621	0.9887	1.0	0.9926	0.9717	0.9441	0.9835	0.9992	0.9858	0.9838	0.9818	0.9856	0.8972
0.032	5.0	26410	0.0998	0.9210	0.9497	0.9351	0.9741	0.9985	0.9962	0.9847	0.9622	0.9614	0.9738	0.9269	0.9431	0.9782	0.9749	0.9708	0.9949	0.8990	0.7116	0.8447	0.4615	0.9646	0.8296	0.7235	0.9966	0.9947	0.9853	0.9672	0.9929	0.9932	0.9919	0.3706	0.7690	0.6836	0.9838	0.9941	0.9789	0.9252	0.8876	0.9960	0.8849	0.9172	1.0	0.9886	0.9847	0.9958	0.9925	0.9483	0.9700	0.9912	1.0	0.9944	0.9756	0.9468	0.99	0.9984	0.9947	0.9806	0.9939	1.0	0.9108
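
The headline evaluation metrics at the top of this card match the epoch 3 row, which has the lowest validation loss, while overall F1 keeps improving through epoch 5. The sketch below applies both selection rules to the per-epoch values from the table. That the published results were chosen by lowest validation loss is an assumption; the card does not state the selection criterion.

```python
# (epoch, validation_loss, overall_f1), copied from the table above.
rows = [
    (1, 0.1365, 0.8469),
    (2, 0.0841, 0.9041),
    (3, 0.0821, 0.9230),
    (4, 0.0848, 0.9255),
    (5, 0.0998, 0.9351),
]

# Rule 1: pick the epoch with the lowest validation loss.
best_by_loss = min(rows, key=lambda row: row[1])

# Rule 2: pick the epoch with the highest overall F1.
best_by_f1 = max(rows, key=lambda row: row[2])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest overall F1:     epoch", best_by_f1[0])
```

The two rules disagree here: validation loss rises after epoch 3 while F1 continues to climb, so the choice of criterion determines which checkpoint looks best.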

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu121
Datasets 2.16.1
Tokenizers 0.15.0
