2023-10-16 18:33:55,768 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,769 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-16 18:33:55,769 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Train: 1166 sentences
2023-10-16 18:33:55,770 (train_with_dev=False, train_with_test=False)
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Training Params:
2023-10-16 18:33:55,770 - learning_rate: "3e-05"
2023-10-16 18:33:55,770 - mini_batch_size: "4"
2023-10-16 18:33:55,770 - max_epochs: "10"
2023-10-16 18:33:55,770 - shuffle: "True"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Plugins:
2023-10-16 18:33:55,770 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:33:55,770 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Computation:
2023-10-16 18:33:55,770 - compute on device: cuda:0
2023-10-16 18:33:55,770 - embedding storage: none
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
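The `lr` column in the iteration lines below follows the LinearScheduler plugin configured above: a linear warmup over the first 10% of the 292 × 10 = 2920 optimizer steps up to the peak rate of 3e-05, then linear decay to zero. The following is an illustrative reimplementation of that schedule, not Flair's own code:

```python
def linear_schedule_lr(step: int, peak_lr: float = 3e-05,
                       steps_per_epoch: int = 292, max_epochs: int = 10,
                       warmup_fraction: float = 0.1) -> float:
    """Learning rate after `step` optimizer steps (1-based) under
    linear warmup followed by linear decay to zero."""
    total = steps_per_epoch * max_epochs        # 2920 steps overall
    warmup = int(total * warmup_fraction)       # 292 warmup steps
    if step <= warmup:                          # linear ramp-up phase
        return peak_lr * step / warmup
    return peak_lr * (total - step) / (total - warmup)  # linear decay

# Reproduces the logged values:
print(round(linear_schedule_lr(29), 6))             # epoch 1, iter 29  -> 3e-06
print(round(linear_schedule_lr(290), 6))            # epoch 1, iter 290 -> 3e-05
print(round(linear_schedule_lr(9 * 292 + 290), 6))  # epoch 10, iter 290 -> 0.0
```

This matches the log: the rate ramps from 0.000003 to 0.000030 during epoch 1, then decays to 0.000000 by the end of epoch 10.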
|
2023-10-16 18:33:57,396 epoch 1 - iter 29/292 - loss 2.97811890 - time (sec): 1.62 - samples/sec: 2779.14 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:33:58,872 epoch 1 - iter 58/292 - loss 2.54035907 - time (sec): 3.10 - samples/sec: 2688.94 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:34:00,580 epoch 1 - iter 87/292 - loss 1.89046488 - time (sec): 4.81 - samples/sec: 2703.25 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:34:02,394 epoch 1 - iter 116/292 - loss 1.55026944 - time (sec): 6.62 - samples/sec: 2706.38 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:34:03,943 epoch 1 - iter 145/292 - loss 1.38005833 - time (sec): 8.17 - samples/sec: 2669.90 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:34:05,716 epoch 1 - iter 174/292 - loss 1.22687622 - time (sec): 9.94 - samples/sec: 2705.66 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:34:07,274 epoch 1 - iter 203/292 - loss 1.10559269 - time (sec): 11.50 - samples/sec: 2693.95 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:34:08,924 epoch 1 - iter 232/292 - loss 0.99171208 - time (sec): 13.15 - samples/sec: 2726.30 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:10,507 epoch 1 - iter 261/292 - loss 0.91903781 - time (sec): 14.74 - samples/sec: 2735.41 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:11,988 epoch 1 - iter 290/292 - loss 0.86127798 - time (sec): 16.22 - samples/sec: 2731.09 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:34:12,084 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:12,084 EPOCH 1 done: loss 0.8594 - lr: 0.000030
2023-10-16 18:34:13,262 DEV : loss 0.2028430998325348 - f1-score (micro avg) 0.4537
2023-10-16 18:34:13,267 saving best model
2023-10-16 18:34:13,617 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:15,292 epoch 2 - iter 29/292 - loss 0.26936582 - time (sec): 1.67 - samples/sec: 2624.97 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:34:17,194 epoch 2 - iter 58/292 - loss 0.27722925 - time (sec): 3.57 - samples/sec: 2779.08 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:18,959 epoch 2 - iter 87/292 - loss 0.26691371 - time (sec): 5.34 - samples/sec: 2671.42 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:20,698 epoch 2 - iter 116/292 - loss 0.24954932 - time (sec): 7.08 - samples/sec: 2675.17 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:22,186 epoch 2 - iter 145/292 - loss 0.23560297 - time (sec): 8.57 - samples/sec: 2672.15 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:23,808 epoch 2 - iter 174/292 - loss 0.22950433 - time (sec): 10.19 - samples/sec: 2698.00 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:25,265 epoch 2 - iter 203/292 - loss 0.22290494 - time (sec): 11.65 - samples/sec: 2692.17 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:26,822 epoch 2 - iter 232/292 - loss 0.21399971 - time (sec): 13.20 - samples/sec: 2687.84 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:28,522 epoch 2 - iter 261/292 - loss 0.20546782 - time (sec): 14.90 - samples/sec: 2677.06 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:30,088 epoch 2 - iter 290/292 - loss 0.19986350 - time (sec): 16.47 - samples/sec: 2672.05 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:30,201 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:30,201 EPOCH 2 done: loss 0.1981 - lr: 0.000027
2023-10-16 18:34:31,473 DEV : loss 0.15322378277778625 - f1-score (micro avg) 0.6482
2023-10-16 18:34:31,480 saving best model
2023-10-16 18:34:31,964 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:33,761 epoch 3 - iter 29/292 - loss 0.09096690 - time (sec): 1.79 - samples/sec: 2759.83 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:35,666 epoch 3 - iter 58/292 - loss 0.10310537 - time (sec): 3.70 - samples/sec: 2685.77 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:37,169 epoch 3 - iter 87/292 - loss 0.10661866 - time (sec): 5.20 - samples/sec: 2637.12 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:38,616 epoch 3 - iter 116/292 - loss 0.10383362 - time (sec): 6.65 - samples/sec: 2580.48 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:40,190 epoch 3 - iter 145/292 - loss 0.10510943 - time (sec): 8.22 - samples/sec: 2618.23 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:41,861 epoch 3 - iter 174/292 - loss 0.10016538 - time (sec): 9.89 - samples/sec: 2678.26 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:43,725 epoch 3 - iter 203/292 - loss 0.10786459 - time (sec): 11.76 - samples/sec: 2736.37 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:45,256 epoch 3 - iter 232/292 - loss 0.10663759 - time (sec): 13.29 - samples/sec: 2714.67 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:46,820 epoch 3 - iter 261/292 - loss 0.10807253 - time (sec): 14.85 - samples/sec: 2700.38 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:48,664 epoch 3 - iter 290/292 - loss 0.11099861 - time (sec): 16.70 - samples/sec: 2639.15 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:48,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:48,770 EPOCH 3 done: loss 0.1104 - lr: 0.000023
2023-10-16 18:34:50,024 DEV : loss 0.12740793824195862 - f1-score (micro avg) 0.6806
2023-10-16 18:34:50,031 saving best model
2023-10-16 18:34:50,504 ----------------------------------------------------------------------------------------------------
|
2023-10-16 18:34:52,267 epoch 4 - iter 29/292 - loss 0.07240283 - time (sec): 1.76 - samples/sec: 2936.93 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:53,979 epoch 4 - iter 58/292 - loss 0.08368012 - time (sec): 3.47 - samples/sec: 2761.94 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:55,684 epoch 4 - iter 87/292 - loss 0.07070759 - time (sec): 5.18 - samples/sec: 2782.71 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:34:57,090 epoch 4 - iter 116/292 - loss 0.07117669 - time (sec): 6.58 - samples/sec: 2705.55 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:34:58,812 epoch 4 - iter 145/292 - loss 0.07127745 - time (sec): 8.30 - samples/sec: 2722.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:35:00,544 epoch 4 - iter 174/292 - loss 0.07173224 - time (sec): 10.04 - samples/sec: 2682.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:02,092 epoch 4 - iter 203/292 - loss 0.07056668 - time (sec): 11.58 - samples/sec: 2652.69 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:03,990 epoch 4 - iter 232/292 - loss 0.06753009 - time (sec): 13.48 - samples/sec: 2688.22 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:05,532 epoch 4 - iter 261/292 - loss 0.07357699 - time (sec): 15.02 - samples/sec: 2677.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:07,100 epoch 4 - iter 290/292 - loss 0.07369514 - time (sec): 16.59 - samples/sec: 2664.83 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:07,196 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:07,196 EPOCH 4 done: loss 0.0738 - lr: 0.000020
2023-10-16 18:35:08,426 DEV : loss 0.11727064847946167 - f1-score (micro avg) 0.7393
2023-10-16 18:35:08,430 saving best model
2023-10-16 18:35:08,915 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:10,622 epoch 5 - iter 29/292 - loss 0.04380722 - time (sec): 1.70 - samples/sec: 2602.82 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:12,303 epoch 5 - iter 58/292 - loss 0.04146231 - time (sec): 3.38 - samples/sec: 2577.03 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:14,043 epoch 5 - iter 87/292 - loss 0.03652196 - time (sec): 5.12 - samples/sec: 2582.15 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:15,797 epoch 5 - iter 116/292 - loss 0.04326091 - time (sec): 6.88 - samples/sec: 2693.95 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:17,516 epoch 5 - iter 145/292 - loss 0.04237026 - time (sec): 8.60 - samples/sec: 2733.27 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:19,160 epoch 5 - iter 174/292 - loss 0.04347168 - time (sec): 10.24 - samples/sec: 2718.21 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:20,829 epoch 5 - iter 203/292 - loss 0.04408652 - time (sec): 11.91 - samples/sec: 2745.81 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:22,415 epoch 5 - iter 232/292 - loss 0.04665523 - time (sec): 13.50 - samples/sec: 2716.37 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:23,912 epoch 5 - iter 261/292 - loss 0.05132093 - time (sec): 14.99 - samples/sec: 2698.13 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:25,420 epoch 5 - iter 290/292 - loss 0.05212658 - time (sec): 16.50 - samples/sec: 2676.48 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:25,519 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:25,519 EPOCH 5 done: loss 0.0521 - lr: 0.000017
2023-10-16 18:35:26,837 DEV : loss 0.12913569808006287 - f1-score (micro avg) 0.7346
2023-10-16 18:35:26,842 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:28,457 epoch 6 - iter 29/292 - loss 0.04619303 - time (sec): 1.61 - samples/sec: 2734.91 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:30,170 epoch 6 - iter 58/292 - loss 0.04616009 - time (sec): 3.33 - samples/sec: 2791.21 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:31,746 epoch 6 - iter 87/292 - loss 0.03824377 - time (sec): 4.90 - samples/sec: 2769.71 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:33,366 epoch 6 - iter 116/292 - loss 0.04127051 - time (sec): 6.52 - samples/sec: 2722.40 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:34,997 epoch 6 - iter 145/292 - loss 0.03927787 - time (sec): 8.15 - samples/sec: 2714.87 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:36,590 epoch 6 - iter 174/292 - loss 0.03842080 - time (sec): 9.75 - samples/sec: 2673.25 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:38,235 epoch 6 - iter 203/292 - loss 0.03737113 - time (sec): 11.39 - samples/sec: 2695.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:39,896 epoch 6 - iter 232/292 - loss 0.03596931 - time (sec): 13.05 - samples/sec: 2718.61 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:41,551 epoch 6 - iter 261/292 - loss 0.03747061 - time (sec): 14.71 - samples/sec: 2714.82 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:43,293 epoch 6 - iter 290/292 - loss 0.03976872 - time (sec): 16.45 - samples/sec: 2690.23 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:43,391 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:43,391 EPOCH 6 done: loss 0.0396 - lr: 0.000013
2023-10-16 18:35:44,694 DEV : loss 0.1291522979736328 - f1-score (micro avg) 0.7653
2023-10-16 18:35:44,699 saving best model
2023-10-16 18:35:45,224 ----------------------------------------------------------------------------------------------------
|
2023-10-16 18:35:46,851 epoch 7 - iter 29/292 - loss 0.02619680 - time (sec): 1.63 - samples/sec: 2695.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:48,443 epoch 7 - iter 58/292 - loss 0.01856380 - time (sec): 3.22 - samples/sec: 2652.47 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:50,103 epoch 7 - iter 87/292 - loss 0.02119680 - time (sec): 4.88 - samples/sec: 2700.85 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:51,851 epoch 7 - iter 116/292 - loss 0.02361280 - time (sec): 6.63 - samples/sec: 2714.12 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:53,415 epoch 7 - iter 145/292 - loss 0.02219740 - time (sec): 8.19 - samples/sec: 2677.13 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:55,367 epoch 7 - iter 174/292 - loss 0.02704237 - time (sec): 10.14 - samples/sec: 2695.94 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:35:56,885 epoch 7 - iter 203/292 - loss 0.02612913 - time (sec): 11.66 - samples/sec: 2689.74 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:35:58,653 epoch 7 - iter 232/292 - loss 0.03100196 - time (sec): 13.43 - samples/sec: 2661.40 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:36:00,339 epoch 7 - iter 261/292 - loss 0.03288401 - time (sec): 15.11 - samples/sec: 2659.04 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:01,908 epoch 7 - iter 290/292 - loss 0.03146715 - time (sec): 16.68 - samples/sec: 2649.58 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:02,007 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:02,007 EPOCH 7 done: loss 0.0313 - lr: 0.000010
2023-10-16 18:36:03,679 DEV : loss 0.17999307811260223 - f1-score (micro avg) 0.7227
2023-10-16 18:36:03,684 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:05,287 epoch 8 - iter 29/292 - loss 0.01237339 - time (sec): 1.60 - samples/sec: 2758.23 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:07,003 epoch 8 - iter 58/292 - loss 0.01348390 - time (sec): 3.32 - samples/sec: 2791.09 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:08,872 epoch 8 - iter 87/292 - loss 0.02301764 - time (sec): 5.19 - samples/sec: 2752.84 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:10,361 epoch 8 - iter 116/292 - loss 0.02333317 - time (sec): 6.68 - samples/sec: 2652.61 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:12,248 epoch 8 - iter 145/292 - loss 0.02259095 - time (sec): 8.56 - samples/sec: 2710.38 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:13,794 epoch 8 - iter 174/292 - loss 0.02842056 - time (sec): 10.11 - samples/sec: 2707.82 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:15,438 epoch 8 - iter 203/292 - loss 0.02616189 - time (sec): 11.75 - samples/sec: 2681.02 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:17,108 epoch 8 - iter 232/292 - loss 0.02767555 - time (sec): 13.42 - samples/sec: 2691.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:18,772 epoch 8 - iter 261/292 - loss 0.02718671 - time (sec): 15.09 - samples/sec: 2694.67 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:20,190 epoch 8 - iter 290/292 - loss 0.02656034 - time (sec): 16.50 - samples/sec: 2681.20 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:20,275 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:20,275 EPOCH 8 done: loss 0.0265 - lr: 0.000007
2023-10-16 18:36:21,583 DEV : loss 0.16463765501976013 - f1-score (micro avg) 0.7474
2023-10-16 18:36:21,590 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:23,262 epoch 9 - iter 29/292 - loss 0.05507861 - time (sec): 1.67 - samples/sec: 2855.76 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:25,078 epoch 9 - iter 58/292 - loss 0.04162172 - time (sec): 3.49 - samples/sec: 2763.36 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:26,669 epoch 9 - iter 87/292 - loss 0.03075030 - time (sec): 5.08 - samples/sec: 2655.80 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:28,264 epoch 9 - iter 116/292 - loss 0.02666806 - time (sec): 6.67 - samples/sec: 2666.52 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:29,796 epoch 9 - iter 145/292 - loss 0.02435338 - time (sec): 8.20 - samples/sec: 2638.11 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:31,386 epoch 9 - iter 174/292 - loss 0.02409412 - time (sec): 9.79 - samples/sec: 2636.49 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:33,037 epoch 9 - iter 203/292 - loss 0.02235971 - time (sec): 11.45 - samples/sec: 2679.09 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:34,503 epoch 9 - iter 232/292 - loss 0.02212491 - time (sec): 12.91 - samples/sec: 2645.26 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:36,254 epoch 9 - iter 261/292 - loss 0.02037368 - time (sec): 14.66 - samples/sec: 2640.49 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:38,158 epoch 9 - iter 290/292 - loss 0.02139770 - time (sec): 16.57 - samples/sec: 2656.24 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:38,299 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:38,300 EPOCH 9 done: loss 0.0212 - lr: 0.000003
2023-10-16 18:36:39,571 DEV : loss 0.16017624735832214 - f1-score (micro avg) 0.7468
2023-10-16 18:36:39,576 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:41,110 epoch 10 - iter 29/292 - loss 0.02369239 - time (sec): 1.53 - samples/sec: 2527.67 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:42,656 epoch 10 - iter 58/292 - loss 0.01948950 - time (sec): 3.08 - samples/sec: 2656.11 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:44,376 epoch 10 - iter 87/292 - loss 0.01562727 - time (sec): 4.80 - samples/sec: 2696.13 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:46,289 epoch 10 - iter 116/292 - loss 0.01696371 - time (sec): 6.71 - samples/sec: 2709.47 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:47,832 epoch 10 - iter 145/292 - loss 0.01566586 - time (sec): 8.25 - samples/sec: 2697.78 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:49,603 epoch 10 - iter 174/292 - loss 0.01677469 - time (sec): 10.03 - samples/sec: 2705.42 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:51,097 epoch 10 - iter 203/292 - loss 0.01572230 - time (sec): 11.52 - samples/sec: 2698.95 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:52,729 epoch 10 - iter 232/292 - loss 0.01515656 - time (sec): 13.15 - samples/sec: 2696.30 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:54,455 epoch 10 - iter 261/292 - loss 0.01773537 - time (sec): 14.88 - samples/sec: 2725.51 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:36:55,964 epoch 10 - iter 290/292 - loss 0.01657736 - time (sec): 16.39 - samples/sec: 2704.03 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:36:56,045 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:56,045 EPOCH 10 done: loss 0.0165 - lr: 0.000000
2023-10-16 18:36:57,349 DEV : loss 0.16469089686870575 - f1-score (micro avg) 0.7484
2023-10-16 18:36:57,753 ----------------------------------------------------------------------------------------------------
|
2023-10-16 18:36:57,754 Loading model from best epoch ...
2023-10-16 18:36:59,284 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:37:01,644
Results:
- F-score (micro) 0.7497
- F-score (macro) 0.6428
- Accuracy 0.6187

By class:
              precision    recall  f1-score   support

         PER     0.8348    0.8276    0.8312       348
         LOC     0.6289    0.8506    0.7231       261
         ORG     0.4054    0.2885    0.3371        52
   HumanProd     0.6071    0.7727    0.6800        22

   micro avg     0.7104    0.7936    0.7497       683
   macro avg     0.6191    0.6848    0.6428       683
weighted avg     0.7161    0.7936    0.7474       683

2023-10-16 18:37:01,645 ----------------------------------------------------------------------------------------------------