2023-10-14 10:08:03,990 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,991 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 10:08:03,991 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Train: 5777 sentences
2023-10-14 10:08:03,992 (train_with_dev=False, train_with_test=False)
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Training Params:
2023-10-14 10:08:03,992  - learning_rate: "3e-05"
2023-10-14 10:08:03,992  - mini_batch_size: "8"
2023-10-14 10:08:03,992  - max_epochs: "10"
2023-10-14 10:08:03,992  - shuffle: "True"
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Plugins:
2023-10-14 10:08:03,992  - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 10:08:03,992  - metric: "('micro avg', 'f1-score')"
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Computation:
2023-10-14 10:08:03,992  - compute on device: cuda:0
2023-10-14 10:08:03,992  - embedding storage: none
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:09,798 epoch 1 - iter 72/723 - loss 2.06007185 - time (sec): 5.80 - samples/sec: 2987.50 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:08:15,437 epoch 1 - iter 144/723 - loss 1.20291780 - time (sec): 11.44 - samples/sec: 3042.63 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:08:20,979 epoch 1 - iter 216/723 - loss 0.87883238 - time (sec): 16.99 - samples/sec: 3054.73 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:08:26,795 epoch 1 - iter 288/723 - loss 0.71332727 - time (sec): 22.80 - samples/sec: 3045.17 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:08:33,145 epoch 1 - iter 360/723 - loss 0.59359558 - time (sec): 29.15 - samples/sec: 3051.16 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:08:39,305 epoch 1 - iter 432/723 - loss 0.52859171 - time (sec): 35.31 - samples/sec: 2996.88 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:08:44,793 epoch 1 - iter 504/723 - loss 0.47878282 - time (sec): 40.80 - samples/sec: 3002.83 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:08:50,941 epoch 1 - iter 576/723 - loss 0.43557998 - time (sec): 46.95 - samples/sec: 2992.78 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:08:56,809 epoch 1 - iter 648/723 - loss 0.40192494 - time (sec): 52.82 - samples/sec: 3001.84 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:09:02,756 epoch 1 - iter 720/723 - loss 0.37551955 - time (sec): 58.76 - samples/sec: 2993.44 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:09:02,920 ----------------------------------------------------------------------------------------------------
2023-10-14 10:09:02,920 EPOCH 1 done: loss 0.3754 - lr: 0.000030
2023-10-14 10:09:06,492 DEV : loss 0.13150866329669952 - f1-score (micro avg) 0.6594
2023-10-14 10:09:06,508 saving best model
2023-10-14 10:09:07,015 ----------------------------------------------------------------------------------------------------
2023-10-14 10:09:13,003 epoch 2 - iter 72/723 - loss 0.13450373 - time (sec): 5.99 - samples/sec: 2985.36 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:09:18,840 epoch 2 - iter 144/723 - loss 0.12552915 - time (sec): 11.82 - samples/sec: 2966.63 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:09:24,322 epoch 2 - iter 216/723 - loss 0.11969064 - time (sec): 17.30 - samples/sec: 3022.02 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:09:30,564 epoch 2 - iter 288/723 - loss 0.11742229 - time (sec): 23.55 - samples/sec: 3008.37 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:09:36,925 epoch 2 - iter 360/723 - loss 0.11264569 - time (sec): 29.91 - samples/sec: 2988.19 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:09:42,706 epoch 2 - iter 432/723 - loss 0.11031740 - time (sec): 35.69 - samples/sec: 2988.22 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:09:48,169 epoch 2 - iter 504/723 - loss 0.10834730 - time (sec): 41.15 - samples/sec: 2994.85 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:09:53,842 epoch 2 - iter 576/723 - loss 0.10492283 - time (sec): 46.83 - samples/sec: 3012.33 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:09:59,792 epoch 2 - iter 648/723 - loss 0.10618747 - time (sec): 52.78 - samples/sec: 3005.38 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:10:05,483 epoch 2 - iter 720/723 - loss 0.10530455 - time (sec): 58.47 - samples/sec: 3006.11 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:10:05,647 ----------------------------------------------------------------------------------------------------
2023-10-14 10:10:05,648 EPOCH 2 done: loss 0.1052 - lr: 0.000027
2023-10-14 10:10:09,197 DEV : loss 0.09088422358036041 - f1-score (micro avg) 0.7818
2023-10-14 10:10:09,213 saving best model
2023-10-14 10:10:09,756 ----------------------------------------------------------------------------------------------------
2023-10-14 10:10:15,427 epoch 3 - iter 72/723 - loss 0.07884216 - time (sec): 5.67 - samples/sec: 3092.57 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:10:21,258 epoch 3 - iter 144/723 - loss 0.07493376 - time (sec): 11.50 - samples/sec: 3032.42 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:10:27,358 epoch 3 - iter 216/723 - loss 0.07038820 - time (sec): 17.60 - samples/sec: 3006.94 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:10:33,575 epoch 3 - iter 288/723 - loss 0.07304398 - time (sec): 23.82 - samples/sec: 2969.72 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:10:39,604 epoch 3 - iter 360/723 - loss 0.06869986 - time (sec): 29.85 - samples/sec: 2951.49 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:10:45,313 epoch 3 - iter 432/723 - loss 0.06666406 - time (sec): 35.56 - samples/sec: 2970.21 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:10:51,783 epoch 3 - iter 504/723 - loss 0.06574320 - time (sec): 42.03 - samples/sec: 2951.83 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:10:57,354 epoch 3 - iter 576/723 - loss 0.06487572 - time (sec): 47.60 - samples/sec: 2955.14 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:11:03,334 epoch 3 - iter 648/723 - loss 0.06527005 - time (sec): 53.58 - samples/sec: 2958.50 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:11:09,185 epoch 3 - iter 720/723 - loss 0.06498541 - time (sec): 59.43 - samples/sec: 2956.97 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:11:09,364 ----------------------------------------------------------------------------------------------------
2023-10-14 10:11:09,364 EPOCH 3 done: loss 0.0650 - lr: 0.000023
2023-10-14 10:11:13,822 DEV : loss 0.08999822288751602 - f1-score (micro avg) 0.8105
2023-10-14 10:11:13,844 saving best model
2023-10-14 10:11:14,331 ----------------------------------------------------------------------------------------------------
2023-10-14 10:11:20,163 epoch 4 - iter 72/723 - loss 0.04124378 - time (sec): 5.83 - samples/sec: 2961.66 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:11:26,426 epoch 4 - iter 144/723 - loss 0.04255973 - time (sec): 12.09 - samples/sec: 2874.11 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:11:32,124 epoch 4 - iter 216/723 - loss 0.04002331 - time (sec): 17.79 - samples/sec: 2872.48 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:11:38,075 epoch 4 - iter 288/723 - loss 0.04041452 - time (sec): 23.74 - samples/sec: 2921.20 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:11:43,923 epoch 4 - iter 360/723 - loss 0.04054372 - time (sec): 29.59 - samples/sec: 2940.59 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:11:50,088 epoch 4 - iter 432/723 - loss 0.04272394 - time (sec): 35.75 - samples/sec: 2951.18 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:11:56,210 epoch 4 - iter 504/723 - loss 0.04250104 - time (sec): 41.87 - samples/sec: 2961.09 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:12:02,173 epoch 4 - iter 576/723 - loss 0.04238222 - time (sec): 47.84 - samples/sec: 2936.00 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:12:07,867 epoch 4 - iter 648/723 - loss 0.04240052 - time (sec): 53.53 - samples/sec: 2942.86 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:12:13,999 epoch 4 - iter 720/723 - loss 0.04290273 - time (sec): 59.66 - samples/sec: 2947.30 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:12:14,161 ----------------------------------------------------------------------------------------------------
2023-10-14 10:12:14,162 EPOCH 4 done: loss 0.0428 - lr: 0.000020
2023-10-14 10:12:17,822 DEV : loss 0.11989317834377289 - f1-score (micro avg) 0.7626
2023-10-14 10:12:17,844 ----------------------------------------------------------------------------------------------------
2023-10-14 10:12:25,215 epoch 5 - iter 72/723 - loss 0.02915324 - time (sec): 7.37 - samples/sec: 2537.77 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:12:30,969 epoch 5 - iter 144/723 - loss 0.03042922 - time (sec): 13.12 - samples/sec: 2734.26 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:12:37,275 epoch 5 - iter 216/723 - loss 0.03176786 - time (sec): 19.43 - samples/sec: 2779.57 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:12:43,306 epoch 5 - iter 288/723 - loss 0.03155849 - time (sec): 25.46 - samples/sec: 2814.76 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:12:49,220 epoch 5 - iter 360/723 - loss 0.03120094 - time (sec): 31.37 - samples/sec: 2841.19 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:12:55,192 epoch 5 - iter 432/723 - loss 0.03150309 - time (sec): 37.35 - samples/sec: 2856.40 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:13:01,212 epoch 5 - iter 504/723 - loss 0.03075019 - time (sec): 43.37 - samples/sec: 2845.59 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:13:07,103 epoch 5 - iter 576/723 - loss 0.03074046 - time (sec): 49.26 - samples/sec: 2859.84 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:13:13,031 epoch 5 - iter 648/723 - loss 0.02971689 - time (sec): 55.19 - samples/sec: 2871.48 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:13:19,192 epoch 5 - iter 720/723 - loss 0.03186083 - time (sec): 61.35 - samples/sec: 2863.54 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:13:19,371 ----------------------------------------------------------------------------------------------------
2023-10-14 10:13:19,371 EPOCH 5 done: loss 0.0318 - lr: 0.000017
2023-10-14 10:13:22,892 DEV : loss 0.11947084218263626 - f1-score (micro avg) 0.807
2023-10-14 10:13:22,908 ----------------------------------------------------------------------------------------------------
2023-10-14 10:13:28,744 epoch 6 - iter 72/723 - loss 0.02063196 - time (sec): 5.84 - samples/sec: 2998.70 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:13:35,309 epoch 6 - iter 144/723 - loss 0.02278467 - time (sec): 12.40 - samples/sec: 2929.50 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:13:41,170 epoch 6 - iter 216/723 - loss 0.02607712 - time (sec): 18.26 - samples/sec: 2935.86 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:13:47,349 epoch 6 - iter 288/723 - loss 0.02571781 - time (sec): 24.44 - samples/sec: 2907.07 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:13:53,357 epoch 6 - iter 360/723 - loss 0.02737462 - time (sec): 30.45 - samples/sec: 2902.17 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:13:59,479 epoch 6 - iter 432/723 - loss 0.02692809 - time (sec): 36.57 - samples/sec: 2921.98 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:14:05,524 epoch 6 - iter 504/723 - loss 0.02745476 - time (sec): 42.61 - samples/sec: 2913.83 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:14:10,964 epoch 6 - iter 576/723 - loss 0.02632377 - time (sec): 48.06 - samples/sec: 2929.49 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:14:16,629 epoch 6 - iter 648/723 - loss 0.02661387 - time (sec): 53.72 - samples/sec: 2927.13 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:14:22,775 epoch 6 - iter 720/723 - loss 0.02605300 - time (sec): 59.87 - samples/sec: 2932.12 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:14:23,039 ----------------------------------------------------------------------------------------------------
2023-10-14 10:14:23,040 EPOCH 6 done: loss 0.0260 - lr: 0.000013
2023-10-14 10:14:26,933 DEV : loss 0.13547733426094055 - f1-score (micro avg) 0.7987
2023-10-14 10:14:26,948 ----------------------------------------------------------------------------------------------------
2023-10-14 10:14:32,668 epoch 7 - iter 72/723 - loss 0.01058272 - time (sec): 5.72 - samples/sec: 3034.42 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:14:39,035 epoch 7 - iter 144/723 - loss 0.01503699 - time (sec): 12.09 - samples/sec: 2903.48 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:14:44,634 epoch 7 - iter 216/723 - loss 0.01870794 - time (sec): 17.68 - samples/sec: 2957.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:14:50,672 epoch 7 - iter 288/723 - loss 0.01832497 - time (sec): 23.72 - samples/sec: 2961.74 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:14:56,448 epoch 7 - iter 360/723 - loss 0.01843164 - time (sec): 29.50 - samples/sec: 2969.17 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:15:02,595 epoch 7 - iter 432/723 - loss 0.01926180 - time (sec): 35.65 - samples/sec: 2963.87 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:15:08,583 epoch 7 - iter 504/723 - loss 0.01935993 - time (sec): 41.63 - samples/sec: 2953.78 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:15:14,430 epoch 7 - iter 576/723 - loss 0.01890065 - time (sec): 47.48 - samples/sec: 2946.41 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:15:20,969 epoch 7 - iter 648/723 - loss 0.01861037 - time (sec): 54.02 - samples/sec: 2926.91 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:15:27,292 epoch 7 - iter 720/723 - loss 0.01854355 - time (sec): 60.34 - samples/sec: 2913.10 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:15:27,458 ----------------------------------------------------------------------------------------------------
2023-10-14 10:15:27,458 EPOCH 7 done: loss 0.0186 - lr: 0.000010
2023-10-14 10:15:30,986 DEV : loss 0.15326355397701263 - f1-score (micro avg) 0.8083
2023-10-14 10:15:31,002 ----------------------------------------------------------------------------------------------------
2023-10-14 10:15:37,084 epoch 8 - iter 72/723 - loss 0.00922642 - time (sec): 6.08 - samples/sec: 2974.53 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:15:43,173 epoch 8 - iter 144/723 - loss 0.00982421 - time (sec): 12.17 - samples/sec: 2890.61 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:15:50,181 epoch 8 - iter 216/723 - loss 0.01240116 - time (sec): 19.18 - samples/sec: 2858.17 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:15:55,387 epoch 8 - iter 288/723 - loss 0.01275710 - time (sec): 24.38 - samples/sec: 2850.31 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:16:01,729 epoch 8 - iter 360/723 - loss 0.01362184 - time (sec): 30.73 - samples/sec: 2872.18 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:16:07,858 epoch 8 - iter 432/723 - loss 0.01349404 - time (sec): 36.86 - samples/sec: 2895.21 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:16:13,356 epoch 8 - iter 504/723 - loss 0.01283957 - time (sec): 42.35 - samples/sec: 2923.66 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:16:19,110 epoch 8 - iter 576/723 - loss 0.01301832 - time (sec): 48.11 - samples/sec: 2930.74 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:16:25,266 epoch 8 - iter 648/723 - loss 0.01330593 - time (sec): 54.26 - samples/sec: 2929.25 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:16:31,073 epoch 8 - iter 720/723 - loss 0.01398724 - time (sec): 60.07 - samples/sec: 2925.95 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:16:31,266 ----------------------------------------------------------------------------------------------------
2023-10-14 10:16:31,266 EPOCH 8 done: loss 0.0140 - lr: 0.000007
2023-10-14 10:16:34,869 DEV : loss 0.1955159604549408 - f1-score (micro avg) 0.8047
2023-10-14 10:16:34,892 ----------------------------------------------------------------------------------------------------
2023-10-14 10:16:40,994 epoch 9 - iter 72/723 - loss 0.00962978 - time (sec): 6.10 - samples/sec: 2955.27 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:16:47,071 epoch 9 - iter 144/723 - loss 0.00819440 - time (sec): 12.18 - samples/sec: 2924.62 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:16:52,865 epoch 9 - iter 216/723 - loss 0.00783127 - time (sec): 17.97 - samples/sec: 2925.93 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:16:59,334 epoch 9 - iter 288/723 - loss 0.00813223 - time (sec): 24.44 - samples/sec: 2904.89 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:17:05,172 epoch 9 - iter 360/723 - loss 0.00940680 - time (sec): 30.28 - samples/sec: 2902.81 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:17:11,766 epoch 9 - iter 432/723 - loss 0.00956376 - time (sec): 36.87 - samples/sec: 2903.92 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:17:17,426 epoch 9 - iter 504/723 - loss 0.00940288 - time (sec): 42.53 - samples/sec: 2903.57 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:17:23,485 epoch 9 - iter 576/723 - loss 0.01068546 - time (sec): 48.59 - samples/sec: 2914.15 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:17:29,177 epoch 9 - iter 648/723 - loss 0.01130811 - time (sec): 54.28 - samples/sec: 2921.02 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:17:35,240 epoch 9 - iter 720/723 - loss 0.01166067 - time (sec): 60.35 - samples/sec: 2914.12 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:17:35,404 ----------------------------------------------------------------------------------------------------
2023-10-14 10:17:35,404 EPOCH 9 done: loss 0.0119 - lr: 0.000003
2023-10-14 10:17:39,827 DEV : loss 0.16821207106113434 - f1-score (micro avg) 0.8241
2023-10-14 10:17:39,847 saving best model
2023-10-14 10:17:40,360 ----------------------------------------------------------------------------------------------------
2023-10-14 10:17:46,342 epoch 10 - iter 72/723 - loss 0.00286898 - time (sec): 5.97 - samples/sec: 2787.87 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:17:52,930 epoch 10 - iter 144/723 - loss 0.00425928 - time (sec): 12.56 - samples/sec: 2824.56 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:17:58,963 epoch 10 - iter 216/723 - loss 0.00904585 - time (sec): 18.60 - samples/sec: 2861.44 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:18:04,706 epoch 10 - iter 288/723 - loss 0.00863157 - time (sec): 24.34 - samples/sec: 2868.59 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:18:11,407 epoch 10 - iter 360/723 - loss 0.00931463 - time (sec): 31.04 - samples/sec: 2843.47 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:18:17,207 epoch 10 - iter 432/723 - loss 0.00842242 - time (sec): 36.84 - samples/sec: 2862.15 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:18:23,428 epoch 10 - iter 504/723 - loss 0.00864621 - time (sec): 43.06 - samples/sec: 2879.88 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:18:29,354 epoch 10 - iter 576/723 - loss 0.00805536 - time (sec): 48.99 - samples/sec: 2885.00 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:18:34,991 epoch 10 - iter 648/723 - loss 0.00772375 - time (sec): 54.62 - samples/sec: 2894.74 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:18:41,108 epoch 10 - iter 720/723 - loss 0.00777814 - time (sec): 60.74 - samples/sec: 2888.92 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:18:41,439 ----------------------------------------------------------------------------------------------------
2023-10-14 10:18:41,439 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-14 10:18:45,046 DEV : loss 0.17490935325622559 - f1-score (micro avg) 0.8154
2023-10-14 10:18:45,487 ----------------------------------------------------------------------------------------------------
2023-10-14 10:18:45,489 Loading model from best epoch ...
2023-10-14 10:18:47,257 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 10:18:50,812
Results:
- F-score (micro) 0.8239
- F-score (macro) 0.7232
- Accuracy 0.712

By class:
              precision    recall  f1-score   support

         PER     0.8532    0.8444    0.8488       482
         LOC     0.8584    0.8472    0.8527       458
         ORG     0.4583    0.4783    0.4681        69

   micro avg     0.8272    0.8206    0.8239      1009
   macro avg     0.7233    0.7233    0.7232      1009
weighted avg     0.8286    0.8206    0.8246      1009

2023-10-14 10:18:50,812 ----------------------------------------------------------------------------------------------------
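The aggregate rows follow directly from the per-class rows: the macro average weights each class equally, while the weighted average weights by support, which is why the frequent PER/LOC classes dominate it and the small ORG class (69 spans) drags the macro score down. A sketch that reproduces them from the table (note the last digit can differ slightly because the table's per-class scores are themselves rounded):

```python
# (precision, recall, f1, support) per class, copied from the table above
per_class = {
    "PER": (0.8532, 0.8444, 0.8488, 482),
    "LOC": (0.8584, 0.8472, 0.8527, 458),
    "ORG": (0.4583, 0.4783, 0.4681, 69),
}

n_classes = len(per_class)
total_support = sum(s for *_, s in per_class.values())  # 1009

# macro: unweighted mean; weighted: support-weighted mean
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / n_classes
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

print(round(macro_f1, 4))     # 0.7232
print(round(weighted_f1, 4))  # 0.8245 (log: 0.8246, computed from unrounded scores)
```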