2023-10-16 18:33:55,768 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,769 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
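The module shapes printed in the summary above are enough to reconstruct the model's parameter count. A quick sanity check (a sketch derived only from the Embedding/Linear/LayerNorm shapes shown, with bias terms included):

```python
# Parameter count reconstructed from the printed module shapes above.
def linear(in_f, out_f):
    # weight matrix + bias vector
    return in_f * out_f + out_f

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
embeddings = 32001 * 768 + 512 * 768 + 2 * 768 + 2 * 768

per_layer = (
    3 * linear(768, 768)            # query, key, value projections
    + linear(768, 768) + 2 * 768    # attention output dense + LayerNorm
    + linear(768, 3072)             # intermediate dense
    + linear(3072, 768) + 2 * 768   # output dense + LayerNorm
)
encoder = 12 * per_layer            # 12 x BertLayer
pooler = linear(768, 768)
head = linear(768, 17)              # the final tagging head (17 labels)

total_params = embeddings + encoder + pooler + head  # ~110.6M
```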
2023-10-16 18:33:55,769 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Train: 1166 sentences
2023-10-16 18:33:55,770 (train_with_dev=False, train_with_test=False)
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Training Params:
2023-10-16 18:33:55,770 - learning_rate: "3e-05"
2023-10-16 18:33:55,770 - mini_batch_size: "4"
2023-10-16 18:33:55,770 - max_epochs: "10"
2023-10-16 18:33:55,770 - shuffle: "True"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Plugins:
2023-10-16 18:33:55,770 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
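A LinearScheduler with warmup_fraction 0.1 means a linear warmup over the first 10% of optimizer steps followed by linear decay to zero, which is consistent with the lr column in the iteration logs below (rising to 3e-05 by the end of epoch 1, then falling to ~0 by epoch 10). A minimal sketch, assuming 2920 total steps (292 iterations x 10 epochs, as reported in this log):

```python
def linear_schedule_lr(step, total_steps=2920, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (sketch of the plugin's schedule)."""
    warmup = int(total_steps * warmup_fraction)  # 292 steps = exactly one epoch here
    if step <= warmup:
        return peak_lr * step / warmup
    return peak_lr * (total_steps - step) / (total_steps - warmup)
```

For example, step 29 (epoch 1, iter 29) gives roughly 3e-06 and step 2920 gives 0.0, matching the logged lr values.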
2023-10-16 18:33:55,770 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:33:55,770 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Computation:
2023-10-16 18:33:55,770 - compute on device: cuda:0
2023-10-16 18:33:55,770 - embedding storage: none
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
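The 292 iterations per epoch reported below follow directly from the corpus size and mini-batch size above (a quick check, assuming the final mini-batch is allowed to be smaller than 4):

```python
import math

train_sentences, mini_batch_size, max_epochs = 1166, 4, 10

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)  # matches "iter .../292"
total_steps = iters_per_epoch * max_epochs
warmup_steps = int(total_steps * 0.1)  # warmup_fraction 0.1 -> one full epoch of warmup
```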
2023-10-16 18:33:57,396 epoch 1 - iter 29/292 - loss 2.97811890 - time (sec): 1.62 - samples/sec: 2779.14 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:33:58,872 epoch 1 - iter 58/292 - loss 2.54035907 - time (sec): 3.10 - samples/sec: 2688.94 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:34:00,580 epoch 1 - iter 87/292 - loss 1.89046488 - time (sec): 4.81 - samples/sec: 2703.25 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:34:02,394 epoch 1 - iter 116/292 - loss 1.55026944 - time (sec): 6.62 - samples/sec: 2706.38 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:34:03,943 epoch 1 - iter 145/292 - loss 1.38005833 - time (sec): 8.17 - samples/sec: 2669.90 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:34:05,716 epoch 1 - iter 174/292 - loss 1.22687622 - time (sec): 9.94 - samples/sec: 2705.66 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:34:07,274 epoch 1 - iter 203/292 - loss 1.10559269 - time (sec): 11.50 - samples/sec: 2693.95 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:34:08,924 epoch 1 - iter 232/292 - loss 0.99171208 - time (sec): 13.15 - samples/sec: 2726.30 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:10,507 epoch 1 - iter 261/292 - loss 0.91903781 - time (sec): 14.74 - samples/sec: 2735.41 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:11,988 epoch 1 - iter 290/292 - loss 0.86127798 - time (sec): 16.22 - samples/sec: 2731.09 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:34:12,084 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:12,084 EPOCH 1 done: loss 0.8594 - lr: 0.000030
2023-10-16 18:34:13,262 DEV : loss 0.2028430998325348 - f1-score (micro avg) 0.4537
2023-10-16 18:34:13,267 saving best model
2023-10-16 18:34:13,617 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:15,292 epoch 2 - iter 29/292 - loss 0.26936582 - time (sec): 1.67 - samples/sec: 2624.97 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:34:17,194 epoch 2 - iter 58/292 - loss 0.27722925 - time (sec): 3.57 - samples/sec: 2779.08 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:18,959 epoch 2 - iter 87/292 - loss 0.26691371 - time (sec): 5.34 - samples/sec: 2671.42 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:20,698 epoch 2 - iter 116/292 - loss 0.24954932 - time (sec): 7.08 - samples/sec: 2675.17 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:22,186 epoch 2 - iter 145/292 - loss 0.23560297 - time (sec): 8.57 - samples/sec: 2672.15 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:23,808 epoch 2 - iter 174/292 - loss 0.22950433 - time (sec): 10.19 - samples/sec: 2698.00 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:25,265 epoch 2 - iter 203/292 - loss 0.22290494 - time (sec): 11.65 - samples/sec: 2692.17 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:26,822 epoch 2 - iter 232/292 - loss 0.21399971 - time (sec): 13.20 - samples/sec: 2687.84 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:28,522 epoch 2 - iter 261/292 - loss 0.20546782 - time (sec): 14.90 - samples/sec: 2677.06 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:30,088 epoch 2 - iter 290/292 - loss 0.19986350 - time (sec): 16.47 - samples/sec: 2672.05 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:30,201 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:30,201 EPOCH 2 done: loss 0.1981 - lr: 0.000027
2023-10-16 18:34:31,473 DEV : loss 0.15322378277778625 - f1-score (micro avg) 0.6482
2023-10-16 18:34:31,480 saving best model
2023-10-16 18:34:31,964 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:33,761 epoch 3 - iter 29/292 - loss 0.09096690 - time (sec): 1.79 - samples/sec: 2759.83 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:35,666 epoch 3 - iter 58/292 - loss 0.10310537 - time (sec): 3.70 - samples/sec: 2685.77 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:37,169 epoch 3 - iter 87/292 - loss 0.10661866 - time (sec): 5.20 - samples/sec: 2637.12 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:38,616 epoch 3 - iter 116/292 - loss 0.10383362 - time (sec): 6.65 - samples/sec: 2580.48 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:40,190 epoch 3 - iter 145/292 - loss 0.10510943 - time (sec): 8.22 - samples/sec: 2618.23 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:41,861 epoch 3 - iter 174/292 - loss 0.10016538 - time (sec): 9.89 - samples/sec: 2678.26 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:43,725 epoch 3 - iter 203/292 - loss 0.10786459 - time (sec): 11.76 - samples/sec: 2736.37 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:45,256 epoch 3 - iter 232/292 - loss 0.10663759 - time (sec): 13.29 - samples/sec: 2714.67 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:46,820 epoch 3 - iter 261/292 - loss 0.10807253 - time (sec): 14.85 - samples/sec: 2700.38 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:48,664 epoch 3 - iter 290/292 - loss 0.11099861 - time (sec): 16.70 - samples/sec: 2639.15 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:48,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:48,770 EPOCH 3 done: loss 0.1104 - lr: 0.000023
2023-10-16 18:34:50,024 DEV : loss 0.12740793824195862 - f1-score (micro avg) 0.6806
2023-10-16 18:34:50,031 saving best model
2023-10-16 18:34:50,504 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:52,267 epoch 4 - iter 29/292 - loss 0.07240283 - time (sec): 1.76 - samples/sec: 2936.93 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:53,979 epoch 4 - iter 58/292 - loss 0.08368012 - time (sec): 3.47 - samples/sec: 2761.94 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:55,684 epoch 4 - iter 87/292 - loss 0.07070759 - time (sec): 5.18 - samples/sec: 2782.71 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:34:57,090 epoch 4 - iter 116/292 - loss 0.07117669 - time (sec): 6.58 - samples/sec: 2705.55 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:34:58,812 epoch 4 - iter 145/292 - loss 0.07127745 - time (sec): 8.30 - samples/sec: 2722.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:35:00,544 epoch 4 - iter 174/292 - loss 0.07173224 - time (sec): 10.04 - samples/sec: 2682.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:02,092 epoch 4 - iter 203/292 - loss 0.07056668 - time (sec): 11.58 - samples/sec: 2652.69 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:03,990 epoch 4 - iter 232/292 - loss 0.06753009 - time (sec): 13.48 - samples/sec: 2688.22 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:05,532 epoch 4 - iter 261/292 - loss 0.07357699 - time (sec): 15.02 - samples/sec: 2677.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:07,100 epoch 4 - iter 290/292 - loss 0.07369514 - time (sec): 16.59 - samples/sec: 2664.83 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:07,196 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:07,196 EPOCH 4 done: loss 0.0738 - lr: 0.000020
2023-10-16 18:35:08,426 DEV : loss 0.11727064847946167 - f1-score (micro avg) 0.7393
2023-10-16 18:35:08,430 saving best model
2023-10-16 18:35:08,915 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:10,622 epoch 5 - iter 29/292 - loss 0.04380722 - time (sec): 1.70 - samples/sec: 2602.82 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:12,303 epoch 5 - iter 58/292 - loss 0.04146231 - time (sec): 3.38 - samples/sec: 2577.03 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:14,043 epoch 5 - iter 87/292 - loss 0.03652196 - time (sec): 5.12 - samples/sec: 2582.15 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:15,797 epoch 5 - iter 116/292 - loss 0.04326091 - time (sec): 6.88 - samples/sec: 2693.95 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:17,516 epoch 5 - iter 145/292 - loss 0.04237026 - time (sec): 8.60 - samples/sec: 2733.27 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:19,160 epoch 5 - iter 174/292 - loss 0.04347168 - time (sec): 10.24 - samples/sec: 2718.21 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:20,829 epoch 5 - iter 203/292 - loss 0.04408652 - time (sec): 11.91 - samples/sec: 2745.81 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:22,415 epoch 5 - iter 232/292 - loss 0.04665523 - time (sec): 13.50 - samples/sec: 2716.37 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:23,912 epoch 5 - iter 261/292 - loss 0.05132093 - time (sec): 14.99 - samples/sec: 2698.13 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:25,420 epoch 5 - iter 290/292 - loss 0.05212658 - time (sec): 16.50 - samples/sec: 2676.48 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:25,519 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:25,519 EPOCH 5 done: loss 0.0521 - lr: 0.000017
2023-10-16 18:35:26,837 DEV : loss 0.12913569808006287 - f1-score (micro avg) 0.7346
2023-10-16 18:35:26,842 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:28,457 epoch 6 - iter 29/292 - loss 0.04619303 - time (sec): 1.61 - samples/sec: 2734.91 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:30,170 epoch 6 - iter 58/292 - loss 0.04616009 - time (sec): 3.33 - samples/sec: 2791.21 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:31,746 epoch 6 - iter 87/292 - loss 0.03824377 - time (sec): 4.90 - samples/sec: 2769.71 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:33,366 epoch 6 - iter 116/292 - loss 0.04127051 - time (sec): 6.52 - samples/sec: 2722.40 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:34,997 epoch 6 - iter 145/292 - loss 0.03927787 - time (sec): 8.15 - samples/sec: 2714.87 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:36,590 epoch 6 - iter 174/292 - loss 0.03842080 - time (sec): 9.75 - samples/sec: 2673.25 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:38,235 epoch 6 - iter 203/292 - loss 0.03737113 - time (sec): 11.39 - samples/sec: 2695.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:39,896 epoch 6 - iter 232/292 - loss 0.03596931 - time (sec): 13.05 - samples/sec: 2718.61 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:41,551 epoch 6 - iter 261/292 - loss 0.03747061 - time (sec): 14.71 - samples/sec: 2714.82 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:43,293 epoch 6 - iter 290/292 - loss 0.03976872 - time (sec): 16.45 - samples/sec: 2690.23 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:43,391 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:43,391 EPOCH 6 done: loss 0.0396 - lr: 0.000013
2023-10-16 18:35:44,694 DEV : loss 0.1291522979736328 - f1-score (micro avg) 0.7653
2023-10-16 18:35:44,699 saving best model
2023-10-16 18:35:45,224 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:46,851 epoch 7 - iter 29/292 - loss 0.02619680 - time (sec): 1.63 - samples/sec: 2695.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:48,443 epoch 7 - iter 58/292 - loss 0.01856380 - time (sec): 3.22 - samples/sec: 2652.47 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:50,103 epoch 7 - iter 87/292 - loss 0.02119680 - time (sec): 4.88 - samples/sec: 2700.85 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:51,851 epoch 7 - iter 116/292 - loss 0.02361280 - time (sec): 6.63 - samples/sec: 2714.12 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:53,415 epoch 7 - iter 145/292 - loss 0.02219740 - time (sec): 8.19 - samples/sec: 2677.13 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:55,367 epoch 7 - iter 174/292 - loss 0.02704237 - time (sec): 10.14 - samples/sec: 2695.94 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:35:56,885 epoch 7 - iter 203/292 - loss 0.02612913 - time (sec): 11.66 - samples/sec: 2689.74 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:35:58,653 epoch 7 - iter 232/292 - loss 0.03100196 - time (sec): 13.43 - samples/sec: 2661.40 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:36:00,339 epoch 7 - iter 261/292 - loss 0.03288401 - time (sec): 15.11 - samples/sec: 2659.04 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:01,908 epoch 7 - iter 290/292 - loss 0.03146715 - time (sec): 16.68 - samples/sec: 2649.58 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:02,007 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:02,007 EPOCH 7 done: loss 0.0313 - lr: 0.000010
2023-10-16 18:36:03,679 DEV : loss 0.17999307811260223 - f1-score (micro avg) 0.7227
2023-10-16 18:36:03,684 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:05,287 epoch 8 - iter 29/292 - loss 0.01237339 - time (sec): 1.60 - samples/sec: 2758.23 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:07,003 epoch 8 - iter 58/292 - loss 0.01348390 - time (sec): 3.32 - samples/sec: 2791.09 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:08,872 epoch 8 - iter 87/292 - loss 0.02301764 - time (sec): 5.19 - samples/sec: 2752.84 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:10,361 epoch 8 - iter 116/292 - loss 0.02333317 - time (sec): 6.68 - samples/sec: 2652.61 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:12,248 epoch 8 - iter 145/292 - loss 0.02259095 - time (sec): 8.56 - samples/sec: 2710.38 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:13,794 epoch 8 - iter 174/292 - loss 0.02842056 - time (sec): 10.11 - samples/sec: 2707.82 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:15,438 epoch 8 - iter 203/292 - loss 0.02616189 - time (sec): 11.75 - samples/sec: 2681.02 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:17,108 epoch 8 - iter 232/292 - loss 0.02767555 - time (sec): 13.42 - samples/sec: 2691.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:18,772 epoch 8 - iter 261/292 - loss 0.02718671 - time (sec): 15.09 - samples/sec: 2694.67 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:20,190 epoch 8 - iter 290/292 - loss 0.02656034 - time (sec): 16.50 - samples/sec: 2681.20 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:20,275 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:20,275 EPOCH 8 done: loss 0.0265 - lr: 0.000007
2023-10-16 18:36:21,583 DEV : loss 0.16463765501976013 - f1-score (micro avg) 0.7474
2023-10-16 18:36:21,590 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:23,262 epoch 9 - iter 29/292 - loss 0.05507861 - time (sec): 1.67 - samples/sec: 2855.76 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:25,078 epoch 9 - iter 58/292 - loss 0.04162172 - time (sec): 3.49 - samples/sec: 2763.36 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:26,669 epoch 9 - iter 87/292 - loss 0.03075030 - time (sec): 5.08 - samples/sec: 2655.80 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:28,264 epoch 9 - iter 116/292 - loss 0.02666806 - time (sec): 6.67 - samples/sec: 2666.52 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:29,796 epoch 9 - iter 145/292 - loss 0.02435338 - time (sec): 8.20 - samples/sec: 2638.11 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:31,386 epoch 9 - iter 174/292 - loss 0.02409412 - time (sec): 9.79 - samples/sec: 2636.49 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:33,037 epoch 9 - iter 203/292 - loss 0.02235971 - time (sec): 11.45 - samples/sec: 2679.09 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:34,503 epoch 9 - iter 232/292 - loss 0.02212491 - time (sec): 12.91 - samples/sec: 2645.26 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:36,254 epoch 9 - iter 261/292 - loss 0.02037368 - time (sec): 14.66 - samples/sec: 2640.49 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:38,158 epoch 9 - iter 290/292 - loss 0.02139770 - time (sec): 16.57 - samples/sec: 2656.24 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:38,299 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:38,300 EPOCH 9 done: loss 0.0212 - lr: 0.000003
2023-10-16 18:36:39,571 DEV : loss 0.16017624735832214 - f1-score (micro avg) 0.7468
2023-10-16 18:36:39,576 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:41,110 epoch 10 - iter 29/292 - loss 0.02369239 - time (sec): 1.53 - samples/sec: 2527.67 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:42,656 epoch 10 - iter 58/292 - loss 0.01948950 - time (sec): 3.08 - samples/sec: 2656.11 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:44,376 epoch 10 - iter 87/292 - loss 0.01562727 - time (sec): 4.80 - samples/sec: 2696.13 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:46,289 epoch 10 - iter 116/292 - loss 0.01696371 - time (sec): 6.71 - samples/sec: 2709.47 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:47,832 epoch 10 - iter 145/292 - loss 0.01566586 - time (sec): 8.25 - samples/sec: 2697.78 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:49,603 epoch 10 - iter 174/292 - loss 0.01677469 - time (sec): 10.03 - samples/sec: 2705.42 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:51,097 epoch 10 - iter 203/292 - loss 0.01572230 - time (sec): 11.52 - samples/sec: 2698.95 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:52,729 epoch 10 - iter 232/292 - loss 0.01515656 - time (sec): 13.15 - samples/sec: 2696.30 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:54,455 epoch 10 - iter 261/292 - loss 0.01773537 - time (sec): 14.88 - samples/sec: 2725.51 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:36:55,964 epoch 10 - iter 290/292 - loss 0.01657736 - time (sec): 16.39 - samples/sec: 2704.03 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:36:56,045 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:56,045 EPOCH 10 done: loss 0.0165 - lr: 0.000000
2023-10-16 18:36:57,349 DEV : loss 0.16469089686870575 - f1-score (micro avg) 0.7484
2023-10-16 18:36:57,753 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:57,754 Loading model from best epoch ...
2023-10-16 18:36:59,284 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
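The 17 tags in the dictionary are exactly what a BIOES tagging scheme yields for the four entity types plus the outside tag (a quick reconstruction of the list printed above):

```python
# BIOES scheme: Single, Begin, End, Inside prefixes per entity type, plus "O".
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
# 1 outside tag + 4 prefixes x 4 entity types = 17 tags
```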
2023-10-16 18:37:01,644
Results:
- F-score (micro) 0.7497
- F-score (macro) 0.6428
- Accuracy 0.6187
By class:
              precision    recall  f1-score   support

         PER     0.8348    0.8276    0.8312       348
         LOC     0.6289    0.8506    0.7231       261
         ORG     0.4054    0.2885    0.3371        52
   HumanProd     0.6071    0.7727    0.6800        22

   micro avg     0.7104    0.7936    0.7497       683
   macro avg     0.6191    0.6848    0.6428       683
weighted avg     0.7161    0.7936    0.7474       683
2023-10-16 18:37:01,645 ----------------------------------------------------------------------------------------------------
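The summary rows of the final report can be re-derived from the per-class table (a sanity check using only the printed, rounded figures):

```python
# Values copied from the "By class" table above (PER, LOC, ORG, HumanProd).
per_class_f1 = [0.8312, 0.7231, 0.3371, 0.6800]
micro_precision, micro_recall = 0.7104, 0.7936

# Micro F1 is the harmonic mean of micro precision and recall;
# macro F1 is the unweighted mean of the per-class F1 scores.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
macro_f1 = sum(per_class_f1) / len(per_class_f1)
# micro_f1 ~ 0.7497 and macro_f1 ~ 0.6428, matching the reported F-scores
```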