2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
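The module shapes printed above are enough to recover the parameter count of this 2-layer, 128-dimensional BERT. A minimal counting sketch, assuming bias terms on every Linear layer and exactly the module layout shown in the printout (the helper names are illustrative, not Flair or Transformers API):

```python
# Count parameters of the printed 2-layer BERT (hidden=128, intermediate=512),
# derived purely from the module shapes in the log above.

H, I, V, P, T, L = 128, 512, 32001, 512, 2, 2  # hidden, intermediate, vocab, positions, token types, layers

def linear(n_in, n_out):
    # weight matrix plus bias vector
    return n_in * n_out + n_out

def layer_norm(n):
    # scale and shift vectors
    return 2 * n

embeddings = V * H + P * H + T * H + layer_norm(H)

per_layer = (
    3 * linear(H, H)      # query, key, value projections
    + linear(H, H)        # attention output dense
    + layer_norm(H)       # attention output LayerNorm
    + linear(H, I)        # intermediate dense
    + linear(I, H)        # output dense
    + layer_norm(H)       # output LayerNorm
)

pooler = linear(H, H)
total_bert = embeddings + L * per_layer + pooler  # ~4.6M parameters
```

The final `Linear(128, 13)` tagging head adds only 128 * 13 + 13 = 1,677 parameters on top of this.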
2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 Train: 14465 sentences
2023-10-18 23:50:09,558 (train_with_dev=False, train_with_test=False)
2023-10-18 23:50:09,558 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,558 Training Params:
2023-10-18 23:50:09,558 - learning_rate: "5e-05"
2023-10-18 23:50:09,559 - mini_batch_size: "8"
2023-10-18 23:50:09,559 - max_epochs: "10"
2023-10-18 23:50:09,559 - shuffle: "True"
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Plugins:
2023-10-18 23:50:09,559 - TensorboardLogger
2023-10-18 23:50:09,559 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 23:50:09,559 - metric: "('micro avg', 'f1-score')"
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Computation:
2023-10-18 23:50:09,559 - compute on device: cuda:0
2023-10-18 23:50:09,559 - embedding storage: none
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
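The LinearScheduler with warmup_fraction 0.1 explains the learning-rate column in the progress lines below: with 1809 iterations per epoch over 10 epochs, the first 1809 steps (10% of 18,090) ramp linearly from 0 to the peak of 5e-05, after which the rate decays linearly back to 0. A minimal sketch of that schedule (a generic re-implementation for illustration, not Flair's actual scheduler code):

```python
# Linear warmup + linear decay, matching the hyperparameters in the log:
# peak lr 5e-05, 10 epochs x 1809 iterations, warmup_fraction 0.1.

PEAK_LR = 5e-05
TOTAL_STEPS = 10 * 1809
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 1809 steps = the whole first epoch

def lr_at(step):
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS          # linear ramp up
    # linear decay from the peak down to zero at the final step
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

This reproduces the logged values: roughly 0.000005 at iteration 180 of epoch 1, the peak 0.000050 at the end of epoch 1, and about 0.000044 by the end of epoch 2.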
2023-10-18 23:50:09,559 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:09,559 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 23:50:13,652 epoch 1 - iter 180/1809 - loss 3.08996031 - time (sec): 4.09 - samples/sec: 9215.23 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:50:17,785 epoch 1 - iter 360/1809 - loss 2.41550781 - time (sec): 8.23 - samples/sec: 9242.53 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:50:22,005 epoch 1 - iter 540/1809 - loss 1.76237783 - time (sec): 12.45 - samples/sec: 9280.74 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:50:25,998 epoch 1 - iter 720/1809 - loss 1.40258020 - time (sec): 16.44 - samples/sec: 9357.12 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:50:29,969 epoch 1 - iter 900/1809 - loss 1.18204419 - time (sec): 20.41 - samples/sec: 9436.48 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:50:34,138 epoch 1 - iter 1080/1809 - loss 1.04359449 - time (sec): 24.58 - samples/sec: 9313.63 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:50:38,296 epoch 1 - iter 1260/1809 - loss 0.93320987 - time (sec): 28.74 - samples/sec: 9251.00 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:50:42,424 epoch 1 - iter 1440/1809 - loss 0.84898279 - time (sec): 32.86 - samples/sec: 9201.97 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:50:46,577 epoch 1 - iter 1620/1809 - loss 0.77865271 - time (sec): 37.02 - samples/sec: 9161.07 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:50:50,699 epoch 1 - iter 1800/1809 - loss 0.71875392 - time (sec): 41.14 - samples/sec: 9194.58 - lr: 0.000050 - momentum: 0.000000
2023-10-18 23:50:50,904 ----------------------------------------------------------------------------------------------------
2023-10-18 23:50:50,905 EPOCH 1 done: loss 0.7161 - lr: 0.000050
2023-10-18 23:50:53,255 DEV : loss 0.17210480570793152 - f1-score (micro avg) 0.2591
2023-10-18 23:50:53,282 saving best model
2023-10-18 23:50:53,312 ----------------------------------------------------------------------------------------------------
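Each progress line above follows a fixed format, so its metrics can be pulled out with a regular expression, e.g. to plot the loss curve. A small parsing sketch (the field names are my own choice, not anything Flair defines):

```python
import re

# Matches Flair progress lines like:
# "... epoch 1 - iter 180/1809 - loss 3.08996031 - ... - lr: 0.000005 - ..."
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+).*?lr: (?P<lr>[\d.]+)"
)

def parse_progress(line):
    # Returns a dict of metrics, or None for separator/non-progress lines.
    m = LINE_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "epoch": int(d["epoch"]),
        "iter": int(d["iter"]),
        "total": int(d["total"]),
        "loss": float(d["loss"]),
        "lr": float(d["lr"]),
    }

row = parse_progress(
    "2023-10-18 23:50:13,652 epoch 1 - iter 180/1809 - loss 3.08996031 "
    "- time (sec): 4.09 - samples/sec: 9215.23 - lr: 0.000005 - momentum: 0.000000"
)
```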
2023-10-18 23:50:57,438 epoch 2 - iter 180/1809 - loss 0.19350008 - time (sec): 4.13 - samples/sec: 9119.45 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:51:01,657 epoch 2 - iter 360/1809 - loss 0.18530881 - time (sec): 8.34 - samples/sec: 9136.68 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:51:05,850 epoch 2 - iter 540/1809 - loss 0.18831855 - time (sec): 12.54 - samples/sec: 9025.27 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:51:10,021 epoch 2 - iter 720/1809 - loss 0.18353362 - time (sec): 16.71 - samples/sec: 9010.09 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:51:13,824 epoch 2 - iter 900/1809 - loss 0.18236110 - time (sec): 20.51 - samples/sec: 9211.87 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:51:17,983 epoch 2 - iter 1080/1809 - loss 0.17912246 - time (sec): 24.67 - samples/sec: 9237.95 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:51:22,214 epoch 2 - iter 1260/1809 - loss 0.17835039 - time (sec): 28.90 - samples/sec: 9189.83 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:51:26,461 epoch 2 - iter 1440/1809 - loss 0.17614405 - time (sec): 33.15 - samples/sec: 9124.36 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:51:30,784 epoch 2 - iter 1620/1809 - loss 0.17446410 - time (sec): 37.47 - samples/sec: 9087.49 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:51:35,022 epoch 2 - iter 1800/1809 - loss 0.17193393 - time (sec): 41.71 - samples/sec: 9063.26 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:51:35,223 ----------------------------------------------------------------------------------------------------
2023-10-18 23:51:35,223 EPOCH 2 done: loss 0.1718 - lr: 0.000044
2023-10-18 23:51:39,047 DEV : loss 0.15255987644195557 - f1-score (micro avg) 0.3569
2023-10-18 23:51:39,075 saving best model
2023-10-18 23:51:39,114 ----------------------------------------------------------------------------------------------------
2023-10-18 23:51:43,333 epoch 3 - iter 180/1809 - loss 0.17150139 - time (sec): 4.22 - samples/sec: 8967.83 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:51:47,648 epoch 3 - iter 360/1809 - loss 0.16038839 - time (sec): 8.53 - samples/sec: 8745.93 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:51:51,903 epoch 3 - iter 540/1809 - loss 0.15736877 - time (sec): 12.79 - samples/sec: 8943.81 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:51:56,136 epoch 3 - iter 720/1809 - loss 0.15451625 - time (sec): 17.02 - samples/sec: 8899.58 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:52:00,284 epoch 3 - iter 900/1809 - loss 0.14915798 - time (sec): 21.17 - samples/sec: 9008.40 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:52:04,655 epoch 3 - iter 1080/1809 - loss 0.14517457 - time (sec): 25.54 - samples/sec: 8988.62 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:52:08,944 epoch 3 - iter 1260/1809 - loss 0.14398672 - time (sec): 29.83 - samples/sec: 8930.99 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:52:13,152 epoch 3 - iter 1440/1809 - loss 0.14423275 - time (sec): 34.04 - samples/sec: 8931.25 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:52:17,308 epoch 3 - iter 1620/1809 - loss 0.14312020 - time (sec): 38.19 - samples/sec: 8940.45 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:52:21,455 epoch 3 - iter 1800/1809 - loss 0.14319625 - time (sec): 42.34 - samples/sec: 8935.40 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:52:21,639 ----------------------------------------------------------------------------------------------------
2023-10-18 23:52:21,640 EPOCH 3 done: loss 0.1434 - lr: 0.000039
2023-10-18 23:52:24,829 DEV : loss 0.15280331671237946 - f1-score (micro avg) 0.4243
2023-10-18 23:52:24,856 saving best model
2023-10-18 23:52:24,888 ----------------------------------------------------------------------------------------------------
2023-10-18 23:52:29,169 epoch 4 - iter 180/1809 - loss 0.14125343 - time (sec): 4.28 - samples/sec: 8441.69 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:52:33,498 epoch 4 - iter 360/1809 - loss 0.13670460 - time (sec): 8.61 - samples/sec: 8764.35 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:52:37,835 epoch 4 - iter 540/1809 - loss 0.13288048 - time (sec): 12.95 - samples/sec: 8765.16 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:52:42,046 epoch 4 - iter 720/1809 - loss 0.13346511 - time (sec): 17.16 - samples/sec: 8749.72 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:52:46,264 epoch 4 - iter 900/1809 - loss 0.13052521 - time (sec): 21.38 - samples/sec: 8780.95 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:52:50,533 epoch 4 - iter 1080/1809 - loss 0.12782764 - time (sec): 25.64 - samples/sec: 8824.36 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:52:54,871 epoch 4 - iter 1260/1809 - loss 0.12904407 - time (sec): 29.98 - samples/sec: 8857.53 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:52:59,036 epoch 4 - iter 1440/1809 - loss 0.13011623 - time (sec): 34.15 - samples/sec: 8855.12 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:53:03,358 epoch 4 - iter 1620/1809 - loss 0.12871829 - time (sec): 38.47 - samples/sec: 8866.49 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:53:07,567 epoch 4 - iter 1800/1809 - loss 0.12782507 - time (sec): 42.68 - samples/sec: 8855.25 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:53:07,776 ----------------------------------------------------------------------------------------------------
2023-10-18 23:53:07,776 EPOCH 4 done: loss 0.1277 - lr: 0.000033
2023-10-18 23:53:11,633 DEV : loss 0.15063098073005676 - f1-score (micro avg) 0.4476
2023-10-18 23:53:11,661 saving best model
2023-10-18 23:53:11,695 ----------------------------------------------------------------------------------------------------
2023-10-18 23:53:15,995 epoch 5 - iter 180/1809 - loss 0.11386799 - time (sec): 4.30 - samples/sec: 9079.20 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:53:20,165 epoch 5 - iter 360/1809 - loss 0.12060812 - time (sec): 8.47 - samples/sec: 9140.56 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:53:24,510 epoch 5 - iter 540/1809 - loss 0.11439752 - time (sec): 12.81 - samples/sec: 9043.00 - lr: 0.000032 - momentum: 0.000000
2023-10-18 23:53:28,764 epoch 5 - iter 720/1809 - loss 0.11352025 - time (sec): 17.07 - samples/sec: 9017.96 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:53:33,019 epoch 5 - iter 900/1809 - loss 0.11380639 - time (sec): 21.32 - samples/sec: 8942.74 - lr: 0.000031 - momentum: 0.000000
2023-10-18 23:53:37,217 epoch 5 - iter 1080/1809 - loss 0.11497194 - time (sec): 25.52 - samples/sec: 8905.41 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:53:41,448 epoch 5 - iter 1260/1809 - loss 0.11425627 - time (sec): 29.75 - samples/sec: 8930.11 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:53:45,607 epoch 5 - iter 1440/1809 - loss 0.11301762 - time (sec): 33.91 - samples/sec: 8951.27 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:53:49,741 epoch 5 - iter 1620/1809 - loss 0.11251591 - time (sec): 38.05 - samples/sec: 8960.67 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:53:54,016 epoch 5 - iter 1800/1809 - loss 0.11238350 - time (sec): 42.32 - samples/sec: 8932.57 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:53:54,227 ----------------------------------------------------------------------------------------------------
2023-10-18 23:53:54,227 EPOCH 5 done: loss 0.1123 - lr: 0.000028
2023-10-18 23:53:57,446 DEV : loss 0.1514587551355362 - f1-score (micro avg) 0.4644
2023-10-18 23:53:57,474 saving best model
2023-10-18 23:53:57,514 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:01,829 epoch 6 - iter 180/1809 - loss 0.11537357 - time (sec): 4.32 - samples/sec: 9104.98 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:54:06,005 epoch 6 - iter 360/1809 - loss 0.11309621 - time (sec): 8.49 - samples/sec: 8967.79 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:54:10,175 epoch 6 - iter 540/1809 - loss 0.11364534 - time (sec): 12.66 - samples/sec: 8824.88 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:54:14,470 epoch 6 - iter 720/1809 - loss 0.11100439 - time (sec): 16.96 - samples/sec: 8875.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:54:18,822 epoch 6 - iter 900/1809 - loss 0.10958062 - time (sec): 21.31 - samples/sec: 8821.53 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:54:23,117 epoch 6 - iter 1080/1809 - loss 0.10709390 - time (sec): 25.60 - samples/sec: 8851.08 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:54:27,381 epoch 6 - iter 1260/1809 - loss 0.10377290 - time (sec): 29.87 - samples/sec: 8834.76 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:54:31,567 epoch 6 - iter 1440/1809 - loss 0.10361018 - time (sec): 34.05 - samples/sec: 8874.50 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:54:35,589 epoch 6 - iter 1620/1809 - loss 0.10410030 - time (sec): 38.07 - samples/sec: 8948.10 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:54:39,866 epoch 6 - iter 1800/1809 - loss 0.10567918 - time (sec): 42.35 - samples/sec: 8930.20 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:54:40,070 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:40,070 EPOCH 6 done: loss 0.1056 - lr: 0.000022
2023-10-18 23:54:43,949 DEV : loss 0.1575939804315567 - f1-score (micro avg) 0.4871
2023-10-18 23:54:43,977 saving best model
2023-10-18 23:54:44,013 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:48,217 epoch 7 - iter 180/1809 - loss 0.10172919 - time (sec): 4.20 - samples/sec: 9184.15 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:54:52,501 epoch 7 - iter 360/1809 - loss 0.10051170 - time (sec): 8.49 - samples/sec: 9162.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:54:56,661 epoch 7 - iter 540/1809 - loss 0.10165841 - time (sec): 12.65 - samples/sec: 9083.25 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:55:00,929 epoch 7 - iter 720/1809 - loss 0.10132993 - time (sec): 16.92 - samples/sec: 9003.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:55:05,261 epoch 7 - iter 900/1809 - loss 0.10000428 - time (sec): 21.25 - samples/sec: 8962.92 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:55:09,564 epoch 7 - iter 1080/1809 - loss 0.09762103 - time (sec): 25.55 - samples/sec: 8964.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:55:13,714 epoch 7 - iter 1260/1809 - loss 0.09765739 - time (sec): 29.70 - samples/sec: 8975.24 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:55:17,857 epoch 7 - iter 1440/1809 - loss 0.09694399 - time (sec): 33.84 - samples/sec: 8975.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:55:22,063 epoch 7 - iter 1620/1809 - loss 0.09829466 - time (sec): 38.05 - samples/sec: 8972.65 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:55:26,296 epoch 7 - iter 1800/1809 - loss 0.09838081 - time (sec): 42.28 - samples/sec: 8954.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:55:26,493 ----------------------------------------------------------------------------------------------------
2023-10-18 23:55:26,493 EPOCH 7 done: loss 0.0984 - lr: 0.000017
2023-10-18 23:55:29,683 DEV : loss 0.16171115636825562 - f1-score (micro avg) 0.5
2023-10-18 23:55:29,711 saving best model
2023-10-18 23:55:29,744 ----------------------------------------------------------------------------------------------------
2023-10-18 23:55:33,953 epoch 8 - iter 180/1809 - loss 0.08931629 - time (sec): 4.21 - samples/sec: 8929.32 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:55:38,202 epoch 8 - iter 360/1809 - loss 0.08518400 - time (sec): 8.46 - samples/sec: 8894.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:55:42,469 epoch 8 - iter 540/1809 - loss 0.08586313 - time (sec): 12.72 - samples/sec: 8976.80 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:55:46,803 epoch 8 - iter 720/1809 - loss 0.08639375 - time (sec): 17.06 - samples/sec: 8883.44 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:55:51,066 epoch 8 - iter 900/1809 - loss 0.08971534 - time (sec): 21.32 - samples/sec: 8920.93 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:55:55,425 epoch 8 - iter 1080/1809 - loss 0.09223836 - time (sec): 25.68 - samples/sec: 8906.09 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:55:59,663 epoch 8 - iter 1260/1809 - loss 0.09291100 - time (sec): 29.92 - samples/sec: 8930.10 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:56:03,864 epoch 8 - iter 1440/1809 - loss 0.09324636 - time (sec): 34.12 - samples/sec: 8935.33 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:56:08,013 epoch 8 - iter 1620/1809 - loss 0.09268125 - time (sec): 38.27 - samples/sec: 8950.24 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:56:12,070 epoch 8 - iter 1800/1809 - loss 0.09179737 - time (sec): 42.33 - samples/sec: 8936.09 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:56:12,234 ----------------------------------------------------------------------------------------------------
2023-10-18 23:56:12,234 EPOCH 8 done: loss 0.0916 - lr: 0.000011
2023-10-18 23:56:16,143 DEV : loss 0.16610997915267944 - f1-score (micro avg) 0.5027
2023-10-18 23:56:16,171 saving best model
2023-10-18 23:56:16,209 ----------------------------------------------------------------------------------------------------
2023-10-18 23:56:20,393 epoch 9 - iter 180/1809 - loss 0.08056805 - time (sec): 4.18 - samples/sec: 9199.47 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:56:24,613 epoch 9 - iter 360/1809 - loss 0.08449942 - time (sec): 8.40 - samples/sec: 9098.24 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:56:28,795 epoch 9 - iter 540/1809 - loss 0.08566893 - time (sec): 12.59 - samples/sec: 9061.30 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:56:32,952 epoch 9 - iter 720/1809 - loss 0.08869739 - time (sec): 16.74 - samples/sec: 9158.64 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:56:37,078 epoch 9 - iter 900/1809 - loss 0.08905400 - time (sec): 20.87 - samples/sec: 9052.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:56:41,233 epoch 9 - iter 1080/1809 - loss 0.08725953 - time (sec): 25.02 - samples/sec: 9055.30 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:56:45,461 epoch 9 - iter 1260/1809 - loss 0.08791601 - time (sec): 29.25 - samples/sec: 9071.52 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:56:49,729 epoch 9 - iter 1440/1809 - loss 0.08850259 - time (sec): 33.52 - samples/sec: 9049.58 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:56:53,950 epoch 9 - iter 1620/1809 - loss 0.08846222 - time (sec): 37.74 - samples/sec: 9038.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:56:58,114 epoch 9 - iter 1800/1809 - loss 0.08900226 - time (sec): 41.90 - samples/sec: 9024.65 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:56:58,333 ----------------------------------------------------------------------------------------------------
2023-10-18 23:56:58,334 EPOCH 9 done: loss 0.0893 - lr: 0.000006
2023-10-18 23:57:01,548 DEV : loss 0.17448826134204865 - f1-score (micro avg) 0.4978
2023-10-18 23:57:01,576 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:05,851 epoch 10 - iter 180/1809 - loss 0.09229581 - time (sec): 4.27 - samples/sec: 8613.74 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:57:10,005 epoch 10 - iter 360/1809 - loss 0.08443788 - time (sec): 8.43 - samples/sec: 8795.85 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:57:14,339 epoch 10 - iter 540/1809 - loss 0.08175246 - time (sec): 12.76 - samples/sec: 8736.35 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:57:18,553 epoch 10 - iter 720/1809 - loss 0.08413821 - time (sec): 16.98 - samples/sec: 8780.50 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:57:22,844 epoch 10 - iter 900/1809 - loss 0.08309562 - time (sec): 21.27 - samples/sec: 8807.69 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:57:27,042 epoch 10 - iter 1080/1809 - loss 0.08624691 - time (sec): 25.47 - samples/sec: 8916.19 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:57:31,177 epoch 10 - iter 1260/1809 - loss 0.08479139 - time (sec): 29.60 - samples/sec: 8926.24 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:57:35,461 epoch 10 - iter 1440/1809 - loss 0.08563822 - time (sec): 33.88 - samples/sec: 8907.81 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:57:39,697 epoch 10 - iter 1620/1809 - loss 0.08500765 - time (sec): 38.12 - samples/sec: 8960.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:57:43,843 epoch 10 - iter 1800/1809 - loss 0.08584855 - time (sec): 42.27 - samples/sec: 8952.52 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:57:44,033 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:44,033 EPOCH 10 done: loss 0.0862 - lr: 0.000000
2023-10-18 23:57:47,913 DEV : loss 0.17676953971385956 - f1-score (micro avg) 0.5019
2023-10-18 23:57:47,972 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:47,973 Loading model from best epoch ...
2023-10-18 23:57:48,060 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
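The 13 tags come from the BIOES scheme over the corpus's three entity types (loc, pers, org): one O tag plus four positional variants per type, for S-ingle, B-egin, E-nd, and I-nside spans. A quick sketch of how that tag set is formed (illustrative, not Flair's internal dictionary code):

```python
# BIOES tagset over the three entity types in this corpus:
# 1 "O" tag + 4 positional prefixes per type = 1 + 4 * 3 = 13 tags.
ENTITY_TYPES = ["loc", "pers", "org"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside

tags = ["O"] + [f"{p}-{t}" for t in ENTITY_TYPES for p in PREFIXES]
```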
2023-10-18 23:57:51,479
Results:
- F-score (micro) 0.5103
- F-score (macro) 0.3397
- Accuracy 0.3558
By class:
              precision    recall  f1-score   support

         loc     0.5186    0.6599    0.5808       591
        pers     0.4167    0.4622    0.4382       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.4834    0.5404    0.5103      1027
   macro avg     0.3118    0.3740    0.3397      1027
weighted avg     0.4433    0.5404    0.4866      1027
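The micro-average F-score is the harmonic mean of the micro-average precision and recall in the table above. Recomputing it from the (rounded) logged values:

```python
def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# micro-avg precision and recall from the results table above
micro_f1 = f1(0.4834, 0.5404)  # ~0.5103, matching the logged F-score (micro)
```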
2023-10-18 23:57:51,479 ----------------------------------------------------------------------------------------------------