Commit 338730b by stefan-it: Upload ./training.log with huggingface_hub
2023-10-25 14:33:10,198 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,199 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(64001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
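The module summary above fixes every weight shape, so the model's parameter count can be recovered with plain arithmetic. A rough sketch (assuming the standard BERT layout of weight-plus-bias Linear layers and weight-plus-bias LayerNorms, as printed above):

```python
# Rough parameter count implied by the printed module summary.
H, FF, LAYERS = 768, 3072, 12

# word + position + token-type embeddings, plus embedding LayerNorm (weight + bias)
embeddings = 64001 * H + 512 * H + 2 * H + 2 * H

per_layer = (
    4 * (H * H + H)      # query, key, value, and attention-output Linear layers
    + 2 * H              # attention LayerNorm
    + (H * FF + FF)      # intermediate Linear
    + (FF * H + H)       # output Linear
    + 2 * H              # output LayerNorm
)
pooler = H * H + H
tagger_head = H * 17 + 17  # final Linear(768 -> 17)

total = embeddings + LAYERS * per_layer + pooler + tagger_head
print(f"{total:,}")  # roughly 135M parameters
```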
2023-10-25 14:33:10,199 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,199 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 14:33:10,199 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,199 Train: 7142 sentences
2023-10-25 14:33:10,199 (train_with_dev=False, train_with_test=False)
2023-10-25 14:33:10,199 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,199 Training Params:
2023-10-25 14:33:10,199 - learning_rate: "3e-05"
2023-10-25 14:33:10,199 - mini_batch_size: "4"
2023-10-25 14:33:10,199 - max_epochs: "10"
2023-10-25 14:33:10,199 - shuffle: "True"
2023-10-25 14:33:10,199 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,199 Plugins:
2023-10-25 14:33:10,199 - TensorboardLogger
2023-10-25 14:33:10,199 - LinearScheduler | warmup_fraction: '0.1'
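The LinearScheduler ramps the learning rate up over the first warmup_fraction of all optimizer steps and then decays it linearly to zero, which matches the lr column below: with 1786 steps per epoch and 10 epochs, warmup is exactly one epoch (1786 of 17860 steps), so the lr peaks at 3e-05 at the end of epoch 1 and reaches zero at the end of epoch 10. A minimal sketch of that schedule in plain Python (the function name is illustrative, not Flair's API):

```python
def linear_schedule_lr(step: int, base_lr: float = 3e-05,
                       total_steps: int = 17860,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1786 = one epoch here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_schedule_lr(1786))   # peak lr at the end of epoch 1: 3e-05
print(linear_schedule_lr(17860))  # final step: 0.0
```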
2023-10-25 14:33:10,199 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,200 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 14:33:10,200 - metric: "('micro avg', 'f1-score')"
2023-10-25 14:33:10,200 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,200 Computation:
2023-10-25 14:33:10,200 - compute on device: cuda:0
2023-10-25 14:33:10,200 - embedding storage: none
2023-10-25 14:33:10,200 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,200 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 14:33:10,200 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,200 ----------------------------------------------------------------------------------------------------
2023-10-25 14:33:10,200 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 14:33:19,310 epoch 1 - iter 178/1786 - loss 2.16458401 - time (sec): 9.11 - samples/sec: 2742.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:33:28,441 epoch 1 - iter 356/1786 - loss 1.31146865 - time (sec): 18.24 - samples/sec: 2768.14 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:33:37,814 epoch 1 - iter 534/1786 - loss 0.98905009 - time (sec): 27.61 - samples/sec: 2714.22 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:33:47,347 epoch 1 - iter 712/1786 - loss 0.80131301 - time (sec): 37.15 - samples/sec: 2700.70 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:33:56,710 epoch 1 - iter 890/1786 - loss 0.68410190 - time (sec): 46.51 - samples/sec: 2690.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:34:06,145 epoch 1 - iter 1068/1786 - loss 0.60428359 - time (sec): 55.94 - samples/sec: 2664.84 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:34:15,401 epoch 1 - iter 1246/1786 - loss 0.53636662 - time (sec): 65.20 - samples/sec: 2688.52 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:34:24,423 epoch 1 - iter 1424/1786 - loss 0.49020486 - time (sec): 74.22 - samples/sec: 2680.72 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:34:33,280 epoch 1 - iter 1602/1786 - loss 0.45227059 - time (sec): 83.08 - samples/sec: 2690.51 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:34:42,376 epoch 1 - iter 1780/1786 - loss 0.42340790 - time (sec): 92.18 - samples/sec: 2688.93 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:34:42,696 ----------------------------------------------------------------------------------------------------
2023-10-25 14:34:42,697 EPOCH 1 done: loss 0.4222 - lr: 0.000030
2023-10-25 14:34:46,074 DEV : loss 0.10848163068294525 - f1-score (micro avg) 0.7428
2023-10-25 14:34:46,097 saving best model
2023-10-25 14:34:46,612 ----------------------------------------------------------------------------------------------------
2023-10-25 14:34:55,533 epoch 2 - iter 178/1786 - loss 0.11053417 - time (sec): 8.92 - samples/sec: 2766.89 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:35:04,481 epoch 2 - iter 356/1786 - loss 0.11490130 - time (sec): 17.87 - samples/sec: 2767.94 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:35:13,392 epoch 2 - iter 534/1786 - loss 0.11620114 - time (sec): 26.78 - samples/sec: 2743.47 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:35:22,226 epoch 2 - iter 712/1786 - loss 0.11705674 - time (sec): 35.61 - samples/sec: 2754.07 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:35:31,022 epoch 2 - iter 890/1786 - loss 0.11188865 - time (sec): 44.41 - samples/sec: 2756.31 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:35:40,149 epoch 2 - iter 1068/1786 - loss 0.11149373 - time (sec): 53.53 - samples/sec: 2737.48 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:35:49,571 epoch 2 - iter 1246/1786 - loss 0.11380784 - time (sec): 62.96 - samples/sec: 2733.71 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:35:58,760 epoch 2 - iter 1424/1786 - loss 0.11159461 - time (sec): 72.15 - samples/sec: 2745.59 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:36:07,970 epoch 2 - iter 1602/1786 - loss 0.10992030 - time (sec): 81.36 - samples/sec: 2753.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:36:17,118 epoch 2 - iter 1780/1786 - loss 0.10952342 - time (sec): 90.50 - samples/sec: 2739.06 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:36:17,436 ----------------------------------------------------------------------------------------------------
2023-10-25 14:36:17,436 EPOCH 2 done: loss 0.1093 - lr: 0.000027
2023-10-25 14:36:22,069 DEV : loss 0.1350429803133011 - f1-score (micro avg) 0.782
2023-10-25 14:36:22,090 saving best model
2023-10-25 14:36:22,804 ----------------------------------------------------------------------------------------------------
2023-10-25 14:36:31,752 epoch 3 - iter 178/1786 - loss 0.06695239 - time (sec): 8.94 - samples/sec: 2747.53 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:36:40,887 epoch 3 - iter 356/1786 - loss 0.07437336 - time (sec): 18.08 - samples/sec: 2833.12 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:36:49,890 epoch 3 - iter 534/1786 - loss 0.07186971 - time (sec): 27.08 - samples/sec: 2785.30 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:36:58,844 epoch 3 - iter 712/1786 - loss 0.07348943 - time (sec): 36.04 - samples/sec: 2761.89 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:37:07,815 epoch 3 - iter 890/1786 - loss 0.07319247 - time (sec): 45.01 - samples/sec: 2735.26 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:37:17,039 epoch 3 - iter 1068/1786 - loss 0.07440492 - time (sec): 54.23 - samples/sec: 2728.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:37:25,994 epoch 3 - iter 1246/1786 - loss 0.07390108 - time (sec): 63.19 - samples/sec: 2748.86 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:37:34,720 epoch 3 - iter 1424/1786 - loss 0.07488194 - time (sec): 71.91 - samples/sec: 2763.21 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:37:43,623 epoch 3 - iter 1602/1786 - loss 0.07447046 - time (sec): 80.82 - samples/sec: 2757.27 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:37:52,704 epoch 3 - iter 1780/1786 - loss 0.07537153 - time (sec): 89.90 - samples/sec: 2760.28 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:37:53,003 ----------------------------------------------------------------------------------------------------
2023-10-25 14:37:53,003 EPOCH 3 done: loss 0.0753 - lr: 0.000023
2023-10-25 14:37:57,933 DEV : loss 0.13353045284748077 - f1-score (micro avg) 0.7897
2023-10-25 14:37:57,956 saving best model
2023-10-25 14:37:58,732 ----------------------------------------------------------------------------------------------------
2023-10-25 14:38:08,143 epoch 4 - iter 178/1786 - loss 0.05605084 - time (sec): 9.41 - samples/sec: 2743.99 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:38:17,438 epoch 4 - iter 356/1786 - loss 0.05402125 - time (sec): 18.70 - samples/sec: 2672.51 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:38:26,525 epoch 4 - iter 534/1786 - loss 0.05499568 - time (sec): 27.79 - samples/sec: 2651.13 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:38:35,548 epoch 4 - iter 712/1786 - loss 0.05241855 - time (sec): 36.81 - samples/sec: 2711.98 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:38:44,562 epoch 4 - iter 890/1786 - loss 0.05240254 - time (sec): 45.83 - samples/sec: 2735.61 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:38:53,859 epoch 4 - iter 1068/1786 - loss 0.05285922 - time (sec): 55.12 - samples/sec: 2707.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:39:03,020 epoch 4 - iter 1246/1786 - loss 0.05189605 - time (sec): 64.28 - samples/sec: 2701.14 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:39:12,115 epoch 4 - iter 1424/1786 - loss 0.05211141 - time (sec): 73.38 - samples/sec: 2702.33 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:39:21,412 epoch 4 - iter 1602/1786 - loss 0.05237397 - time (sec): 82.68 - samples/sec: 2709.90 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:39:30,629 epoch 4 - iter 1780/1786 - loss 0.05151436 - time (sec): 91.89 - samples/sec: 2698.68 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:39:30,954 ----------------------------------------------------------------------------------------------------
2023-10-25 14:39:30,955 EPOCH 4 done: loss 0.0518 - lr: 0.000020
2023-10-25 14:39:34,712 DEV : loss 0.1849958300590515 - f1-score (micro avg) 0.7773
2023-10-25 14:39:34,734 ----------------------------------------------------------------------------------------------------
2023-10-25 14:39:44,192 epoch 5 - iter 178/1786 - loss 0.04597938 - time (sec): 9.46 - samples/sec: 2793.42 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:39:53,753 epoch 5 - iter 356/1786 - loss 0.04466867 - time (sec): 19.02 - samples/sec: 2708.97 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:40:03,388 epoch 5 - iter 534/1786 - loss 0.04058169 - time (sec): 28.65 - samples/sec: 2664.94 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:40:13,272 epoch 5 - iter 712/1786 - loss 0.04127086 - time (sec): 38.54 - samples/sec: 2597.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:40:22,719 epoch 5 - iter 890/1786 - loss 0.04095590 - time (sec): 47.98 - samples/sec: 2566.83 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:40:32,011 epoch 5 - iter 1068/1786 - loss 0.04113038 - time (sec): 57.27 - samples/sec: 2571.21 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:40:41,479 epoch 5 - iter 1246/1786 - loss 0.04188631 - time (sec): 66.74 - samples/sec: 2550.79 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:40:50,828 epoch 5 - iter 1424/1786 - loss 0.04177759 - time (sec): 76.09 - samples/sec: 2595.45 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:40:59,684 epoch 5 - iter 1602/1786 - loss 0.04071695 - time (sec): 84.95 - samples/sec: 2627.35 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:41:08,330 epoch 5 - iter 1780/1786 - loss 0.04018851 - time (sec): 93.59 - samples/sec: 2649.76 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:41:08,618 ----------------------------------------------------------------------------------------------------
2023-10-25 14:41:08,619 EPOCH 5 done: loss 0.0401 - lr: 0.000017
2023-10-25 14:41:13,493 DEV : loss 0.19509676098823547 - f1-score (micro avg) 0.7897
2023-10-25 14:41:13,516 saving best model
2023-10-25 14:41:14,286 ----------------------------------------------------------------------------------------------------
2023-10-25 14:41:23,859 epoch 6 - iter 178/1786 - loss 0.02929917 - time (sec): 9.57 - samples/sec: 2793.46 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:41:33,563 epoch 6 - iter 356/1786 - loss 0.03057329 - time (sec): 19.27 - samples/sec: 2683.12 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:41:43,148 epoch 6 - iter 534/1786 - loss 0.03026210 - time (sec): 28.86 - samples/sec: 2613.08 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:41:52,622 epoch 6 - iter 712/1786 - loss 0.03139423 - time (sec): 38.33 - samples/sec: 2649.33 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:42:02,141 epoch 6 - iter 890/1786 - loss 0.03141974 - time (sec): 47.85 - samples/sec: 2637.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:42:11,399 epoch 6 - iter 1068/1786 - loss 0.03227820 - time (sec): 57.11 - samples/sec: 2644.91 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:42:20,262 epoch 6 - iter 1246/1786 - loss 0.03178910 - time (sec): 65.97 - samples/sec: 2655.85 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:42:29,099 epoch 6 - iter 1424/1786 - loss 0.03207149 - time (sec): 74.81 - samples/sec: 2668.25 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:42:37,964 epoch 6 - iter 1602/1786 - loss 0.03149622 - time (sec): 83.67 - samples/sec: 2688.13 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:42:46,897 epoch 6 - iter 1780/1786 - loss 0.03192675 - time (sec): 92.61 - samples/sec: 2677.94 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:42:47,185 ----------------------------------------------------------------------------------------------------
2023-10-25 14:42:47,186 EPOCH 6 done: loss 0.0318 - lr: 0.000013
2023-10-25 14:42:51,967 DEV : loss 0.20672442018985748 - f1-score (micro avg) 0.7962
2023-10-25 14:42:51,988 saving best model
2023-10-25 14:42:52,693 ----------------------------------------------------------------------------------------------------
2023-10-25 14:43:01,533 epoch 7 - iter 178/1786 - loss 0.01719752 - time (sec): 8.84 - samples/sec: 2600.24 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:43:10,125 epoch 7 - iter 356/1786 - loss 0.02135005 - time (sec): 17.43 - samples/sec: 2746.42 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:43:19,192 epoch 7 - iter 534/1786 - loss 0.02318318 - time (sec): 26.50 - samples/sec: 2854.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:43:28,579 epoch 7 - iter 712/1786 - loss 0.02312462 - time (sec): 35.88 - samples/sec: 2842.16 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:43:37,817 epoch 7 - iter 890/1786 - loss 0.02250943 - time (sec): 45.12 - samples/sec: 2758.16 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:43:46,912 epoch 7 - iter 1068/1786 - loss 0.02128871 - time (sec): 54.22 - samples/sec: 2738.00 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:43:56,071 epoch 7 - iter 1246/1786 - loss 0.02132723 - time (sec): 63.38 - samples/sec: 2735.65 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:44:05,300 epoch 7 - iter 1424/1786 - loss 0.02185820 - time (sec): 72.61 - samples/sec: 2731.09 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:44:14,347 epoch 7 - iter 1602/1786 - loss 0.02218104 - time (sec): 81.65 - samples/sec: 2724.54 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:44:23,204 epoch 7 - iter 1780/1786 - loss 0.02222713 - time (sec): 90.51 - samples/sec: 2740.75 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:44:23,491 ----------------------------------------------------------------------------------------------------
2023-10-25 14:44:23,491 EPOCH 7 done: loss 0.0223 - lr: 0.000010
2023-10-25 14:44:27,321 DEV : loss 0.22337248921394348 - f1-score (micro avg) 0.7964
2023-10-25 14:44:27,339 saving best model
2023-10-25 14:44:28,025 ----------------------------------------------------------------------------------------------------
2023-10-25 14:44:37,591 epoch 8 - iter 178/1786 - loss 0.01779272 - time (sec): 9.56 - samples/sec: 2588.34 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:44:47,021 epoch 8 - iter 356/1786 - loss 0.01894142 - time (sec): 18.99 - samples/sec: 2700.71 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:44:56,629 epoch 8 - iter 534/1786 - loss 0.01605388 - time (sec): 28.60 - samples/sec: 2696.10 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:45:06,337 epoch 8 - iter 712/1786 - loss 0.01635096 - time (sec): 38.31 - samples/sec: 2688.10 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:45:15,994 epoch 8 - iter 890/1786 - loss 0.01715962 - time (sec): 47.96 - samples/sec: 2632.09 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:45:25,422 epoch 8 - iter 1068/1786 - loss 0.01852086 - time (sec): 57.39 - samples/sec: 2616.71 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:45:34,422 epoch 8 - iter 1246/1786 - loss 0.01764650 - time (sec): 66.39 - samples/sec: 2636.24 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:45:44,111 epoch 8 - iter 1424/1786 - loss 0.01697131 - time (sec): 76.08 - samples/sec: 2625.83 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:45:53,686 epoch 8 - iter 1602/1786 - loss 0.01660441 - time (sec): 85.66 - samples/sec: 2623.06 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:46:03,240 epoch 8 - iter 1780/1786 - loss 0.01626775 - time (sec): 95.21 - samples/sec: 2604.67 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:46:03,553 ----------------------------------------------------------------------------------------------------
2023-10-25 14:46:03,553 EPOCH 8 done: loss 0.0162 - lr: 0.000007
2023-10-25 14:46:08,504 DEV : loss 0.22934271395206451 - f1-score (micro avg) 0.7997
2023-10-25 14:46:08,526 saving best model
2023-10-25 14:46:09,250 ----------------------------------------------------------------------------------------------------
2023-10-25 14:46:18,809 epoch 9 - iter 178/1786 - loss 0.01295900 - time (sec): 9.56 - samples/sec: 2559.79 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:46:28,369 epoch 9 - iter 356/1786 - loss 0.01695431 - time (sec): 19.12 - samples/sec: 2630.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:46:37,819 epoch 9 - iter 534/1786 - loss 0.01347445 - time (sec): 28.57 - samples/sec: 2666.73 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:46:47,103 epoch 9 - iter 712/1786 - loss 0.01503886 - time (sec): 37.85 - samples/sec: 2644.36 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:46:56,001 epoch 9 - iter 890/1786 - loss 0.01306635 - time (sec): 46.75 - samples/sec: 2693.10 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:47:05,076 epoch 9 - iter 1068/1786 - loss 0.01204591 - time (sec): 55.82 - samples/sec: 2692.47 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:47:14,461 epoch 9 - iter 1246/1786 - loss 0.01146321 - time (sec): 65.21 - samples/sec: 2690.78 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:47:23,721 epoch 9 - iter 1424/1786 - loss 0.01130831 - time (sec): 74.47 - samples/sec: 2677.47 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:47:33,098 epoch 9 - iter 1602/1786 - loss 0.01112742 - time (sec): 83.85 - samples/sec: 2675.96 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:47:42,454 epoch 9 - iter 1780/1786 - loss 0.01140774 - time (sec): 93.20 - samples/sec: 2661.17 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:47:42,751 ----------------------------------------------------------------------------------------------------
2023-10-25 14:47:42,751 EPOCH 9 done: loss 0.0114 - lr: 0.000003
2023-10-25 14:47:46,674 DEV : loss 0.2408183217048645 - f1-score (micro avg) 0.7965
2023-10-25 14:47:46,698 ----------------------------------------------------------------------------------------------------
2023-10-25 14:47:56,117 epoch 10 - iter 178/1786 - loss 0.00749371 - time (sec): 9.42 - samples/sec: 2742.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:48:05,390 epoch 10 - iter 356/1786 - loss 0.00521229 - time (sec): 18.69 - samples/sec: 2747.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:48:14,524 epoch 10 - iter 534/1786 - loss 0.00725197 - time (sec): 27.82 - samples/sec: 2703.18 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:48:23,669 epoch 10 - iter 712/1786 - loss 0.00808463 - time (sec): 36.97 - samples/sec: 2718.18 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:48:32,464 epoch 10 - iter 890/1786 - loss 0.00808605 - time (sec): 45.76 - samples/sec: 2711.31 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:48:41,419 epoch 10 - iter 1068/1786 - loss 0.00815421 - time (sec): 54.72 - samples/sec: 2706.54 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:48:50,504 epoch 10 - iter 1246/1786 - loss 0.00802142 - time (sec): 63.80 - samples/sec: 2718.32 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:48:59,792 epoch 10 - iter 1424/1786 - loss 0.00776468 - time (sec): 73.09 - samples/sec: 2721.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:49:09,176 epoch 10 - iter 1602/1786 - loss 0.00740460 - time (sec): 82.48 - samples/sec: 2707.01 - lr: 0.000000 - momentum: 0.000000
2023-10-25 14:49:18,632 epoch 10 - iter 1780/1786 - loss 0.00732665 - time (sec): 91.93 - samples/sec: 2695.35 - lr: 0.000000 - momentum: 0.000000
2023-10-25 14:49:18,972 ----------------------------------------------------------------------------------------------------
2023-10-25 14:49:18,973 EPOCH 10 done: loss 0.0073 - lr: 0.000000
2023-10-25 14:49:24,029 DEV : loss 0.244009330868721 - f1-score (micro avg) 0.7973
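best-model.pt tracks the highest dev micro-F1 seen so far, which is why epochs 4, 9, and 10 print no "saving best model" line. The selection can be reproduced from the per-epoch DEV scores (values copied from the DEV lines above; a strict argmax sketch, whereas the trainer also re-saves on ties, as in epoch 5):

```python
# Dev micro-F1 per epoch, copied from the DEV lines in this log.
dev_f1 = [0.7428, 0.782, 0.7897, 0.7773, 0.7897,
          0.7962, 0.7964, 0.7997, 0.7965, 0.7973]

best_epoch = max(range(1, 11), key=lambda e: dev_f1[e - 1])
print(best_epoch, dev_f1[best_epoch - 1])  # epoch 8 wins with 0.7997
```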
2023-10-25 14:49:24,566 ----------------------------------------------------------------------------------------------------
2023-10-25 14:49:24,567 Loading model from best epoch ...
2023-10-25 14:49:26,591 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
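The 17 tags follow the BIOES scheme: a lone O plus S- (single), B- (begin), E- (end), and I- (inside) variants of each of the four entity types. Generating that dictionary is mechanical (a sketch of the scheme only; Flair actually builds the dictionary from the corpus labels):

```python
entity_types = ["PER", "LOC", "ORG", "HumanProd"]

# O plus the four BIOES prefixes per entity type, in the order printed above.
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]

print(len(tags), tags[:5])  # 17 tags: O, S-PER, B-PER, E-PER, I-PER, ...
```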
2023-10-25 14:49:38,665
Results:
- F-score (micro) 0.6933
- F-score (macro) 0.619
- Accuracy 0.5461
By class:
              precision    recall  f1-score   support

         LOC     0.6972    0.6667    0.6816      1095
         PER     0.7832    0.7816    0.7824      1012
         ORG     0.4733    0.5714    0.5178       357
   HumanProd     0.4038    0.6364    0.4941        33

   micro avg     0.6874    0.6992    0.6933      2497
   macro avg     0.5894    0.6640    0.6190      2497
weighted avg     0.6962    0.6992    0.6966      2497
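The micro average weights every class by its support. It can be reconstructed from the per-class rows by recovering approximate true-positive and prediction counts; the counts below are inferred by rounding recall × support and TP / precision, so treat this as a consistency check rather than exact bookkeeping:

```python
# (precision, recall, support) per class, from the table above.
per_class = {
    "LOC":       (0.6972, 0.6667, 1095),
    "PER":       (0.7832, 0.7816, 1012),
    "ORG":       (0.4733, 0.5714,  357),
    "HumanProd": (0.4038, 0.6364,   33),
}

tp = pred = gold = 0
for p, r, support in per_class.values():
    class_tp = round(r * support)   # inferred true positives
    tp += class_tp
    pred += round(class_tp / p)     # inferred predicted spans
    gold += support                 # gold spans

micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# 0.6874 0.6992 0.6933 -- matches the micro avg row
```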
2023-10-25 14:49:38,666 ----------------------------------------------------------------------------------------------------