2023-10-18 18:42:19,552 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,552 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Train: 5901 sentences
2023-10-18 18:42:19,553 (train_with_dev=False, train_with_test=False)
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Training Params:
2023-10-18 18:42:19,553 - learning_rate: "3e-05"
2023-10-18 18:42:19,553 - mini_batch_size: "4"
2023-10-18 18:42:19,553 - max_epochs: "10"
2023-10-18 18:42:19,553 - shuffle: "True"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
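The iteration count per epoch follows directly from these parameters: with 5901 training sentences and a mini-batch size of 4, each epoch runs ceil(5901 / 4) = 1476 mini-batches, which is why the progress lines below read "iter .../1476". A minimal sketch of that arithmetic:

```python
import math

# Values taken from the log above.
train_sentences = 5901
mini_batch_size = 4

# One epoch iterates over ceil(N / batch_size) mini-batches.
iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
print(iters_per_epoch)  # 1476
```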
2023-10-18 18:42:19,553 Plugins:
2023-10-18 18:42:19,553 - TensorboardLogger
2023-10-18 18:42:19,553 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
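The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate up linearly over the first 10% of all steps (here 1476 of the 10 x 1476 = 14760 total, i.e. exactly epoch 1) and then decays it linearly to zero, which is consistent with the lr values logged below: 3e-05 at the end of epoch 1, 0 at the end of epoch 10. A sketch of that schedule, assuming zero-based step counting and integer-truncated warmup steps (the function name here is illustrative, not Flair's API):

```python
def linear_schedule(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1476 * 10  # 10 epochs x 1476 iterations
peak = 3e-05
print(linear_schedule(1476, total, peak))   # peak lr at the end of warmup
print(linear_schedule(total, total, peak))  # decayed to 0.0 at the final step
```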
2023-10-18 18:42:19,553 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:42:19,553 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Computation:
2023-10-18 18:42:19,553 - compute on device: cuda:0
2023-10-18 18:42:19,553 - embedding storage: none
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,554 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:42:22,483 epoch 1 - iter 147/1476 - loss 3.78556163 - time (sec): 2.93 - samples/sec: 5682.81 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:42:24,846 epoch 1 - iter 294/1476 - loss 3.49556145 - time (sec): 5.29 - samples/sec: 6046.59 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:42:27,226 epoch 1 - iter 441/1476 - loss 3.01465804 - time (sec): 7.67 - samples/sec: 6603.04 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:42:29,575 epoch 1 - iter 588/1476 - loss 2.53729222 - time (sec): 10.02 - samples/sec: 6748.99 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:42:31,898 epoch 1 - iter 735/1476 - loss 2.20001287 - time (sec): 12.34 - samples/sec: 6749.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:42:34,215 epoch 1 - iter 882/1476 - loss 1.96299069 - time (sec): 14.66 - samples/sec: 6740.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:42:36,563 epoch 1 - iter 1029/1476 - loss 1.76317650 - time (sec): 17.01 - samples/sec: 6825.64 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:42:38,861 epoch 1 - iter 1176/1476 - loss 1.61885567 - time (sec): 19.31 - samples/sec: 6848.58 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:42:41,179 epoch 1 - iter 1323/1476 - loss 1.50834098 - time (sec): 21.63 - samples/sec: 6821.58 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:42:43,583 epoch 1 - iter 1470/1476 - loss 1.39708712 - time (sec): 24.03 - samples/sec: 6903.66 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:42:43,811 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:43,811 EPOCH 1 done: loss 1.3940 - lr: 0.000030
2023-10-18 18:42:46,113 DEV : loss 0.41898399591445923 - f1-score (micro avg) 0.0443
2023-10-18 18:42:46,137 saving best model
2023-10-18 18:42:46,171 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:48,566 epoch 2 - iter 147/1476 - loss 0.47548170 - time (sec): 2.39 - samples/sec: 6840.57 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:42:51,002 epoch 2 - iter 294/1476 - loss 0.48872686 - time (sec): 4.83 - samples/sec: 7549.30 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:42:53,397 epoch 2 - iter 441/1476 - loss 0.47995919 - time (sec): 7.23 - samples/sec: 7321.02 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:42:55,762 epoch 2 - iter 588/1476 - loss 0.47927875 - time (sec): 9.59 - samples/sec: 7245.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:42:58,037 epoch 2 - iter 735/1476 - loss 0.47349392 - time (sec): 11.87 - samples/sec: 7110.07 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:43:00,393 epoch 2 - iter 882/1476 - loss 0.47046838 - time (sec): 14.22 - samples/sec: 7145.73 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:43:02,764 epoch 2 - iter 1029/1476 - loss 0.45588573 - time (sec): 16.59 - samples/sec: 7206.55 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:43:05,048 epoch 2 - iter 1176/1476 - loss 0.45406832 - time (sec): 18.88 - samples/sec: 7118.13 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:43:07,287 epoch 2 - iter 1323/1476 - loss 0.44810182 - time (sec): 21.12 - samples/sec: 7086.86 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:43:09,615 epoch 2 - iter 1470/1476 - loss 0.44252694 - time (sec): 23.44 - samples/sec: 7074.16 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:43:09,700 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:09,700 EPOCH 2 done: loss 0.4425 - lr: 0.000027
2023-10-18 18:43:17,042 DEV : loss 0.31516316533088684 - f1-score (micro avg) 0.2966
2023-10-18 18:43:17,065 saving best model
2023-10-18 18:43:17,099 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:19,471 epoch 3 - iter 147/1476 - loss 0.37653743 - time (sec): 2.37 - samples/sec: 7173.15 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:43:21,856 epoch 3 - iter 294/1476 - loss 0.38356178 - time (sec): 4.76 - samples/sec: 7220.20 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:43:24,194 epoch 3 - iter 441/1476 - loss 0.38400063 - time (sec): 7.09 - samples/sec: 7109.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:43:26,526 epoch 3 - iter 588/1476 - loss 0.37943481 - time (sec): 9.43 - samples/sec: 7064.00 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:43:28,778 epoch 3 - iter 735/1476 - loss 0.38111007 - time (sec): 11.68 - samples/sec: 6886.94 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:43:31,090 epoch 3 - iter 882/1476 - loss 0.37841769 - time (sec): 13.99 - samples/sec: 6911.81 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:43:33,464 epoch 3 - iter 1029/1476 - loss 0.36828271 - time (sec): 16.36 - samples/sec: 6962.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:43:35,777 epoch 3 - iter 1176/1476 - loss 0.37002507 - time (sec): 18.68 - samples/sec: 7063.49 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:43:38,053 epoch 3 - iter 1323/1476 - loss 0.37070190 - time (sec): 20.95 - samples/sec: 7070.45 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:43:40,424 epoch 3 - iter 1470/1476 - loss 0.37243905 - time (sec): 23.32 - samples/sec: 7107.03 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:43:40,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:40,511 EPOCH 3 done: loss 0.3731 - lr: 0.000023
2023-10-18 18:43:47,546 DEV : loss 0.28877052664756775 - f1-score (micro avg) 0.3558
2023-10-18 18:43:47,570 saving best model
2023-10-18 18:43:47,607 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:50,005 epoch 4 - iter 147/1476 - loss 0.34367342 - time (sec): 2.40 - samples/sec: 7349.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:43:52,283 epoch 4 - iter 294/1476 - loss 0.32751733 - time (sec): 4.68 - samples/sec: 7008.91 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:43:54,604 epoch 4 - iter 441/1476 - loss 0.33681949 - time (sec): 7.00 - samples/sec: 6977.62 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:43:56,781 epoch 4 - iter 588/1476 - loss 0.33417208 - time (sec): 9.17 - samples/sec: 7228.72 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:43:58,893 epoch 4 - iter 735/1476 - loss 0.32979133 - time (sec): 11.29 - samples/sec: 7480.23 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:44:01,204 epoch 4 - iter 882/1476 - loss 0.33930172 - time (sec): 13.60 - samples/sec: 7433.45 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:44:03,533 epoch 4 - iter 1029/1476 - loss 0.33934197 - time (sec): 15.93 - samples/sec: 7378.33 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:44:05,902 epoch 4 - iter 1176/1476 - loss 0.33984314 - time (sec): 18.29 - samples/sec: 7307.51 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:44:08,208 epoch 4 - iter 1323/1476 - loss 0.33833182 - time (sec): 20.60 - samples/sec: 7277.07 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:44:10,567 epoch 4 - iter 1470/1476 - loss 0.33694420 - time (sec): 22.96 - samples/sec: 7221.17 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:44:10,651 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:10,651 EPOCH 4 done: loss 0.3368 - lr: 0.000020
2023-10-18 18:44:17,690 DEV : loss 0.28429877758026123 - f1-score (micro avg) 0.3935
2023-10-18 18:44:17,715 saving best model
2023-10-18 18:44:17,749 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:20,063 epoch 5 - iter 147/1476 - loss 0.32343183 - time (sec): 2.31 - samples/sec: 7010.69 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:44:22,406 epoch 5 - iter 294/1476 - loss 0.31723666 - time (sec): 4.66 - samples/sec: 6933.10 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:44:24,747 epoch 5 - iter 441/1476 - loss 0.31507297 - time (sec): 7.00 - samples/sec: 6946.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:44:27,237 epoch 5 - iter 588/1476 - loss 0.31066895 - time (sec): 9.49 - samples/sec: 7185.66 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:44:29,596 epoch 5 - iter 735/1476 - loss 0.31208579 - time (sec): 11.85 - samples/sec: 7206.13 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:44:32,023 epoch 5 - iter 882/1476 - loss 0.31223448 - time (sec): 14.27 - samples/sec: 7080.60 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:44:34,374 epoch 5 - iter 1029/1476 - loss 0.31346772 - time (sec): 16.62 - samples/sec: 6926.93 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:44:36,703 epoch 5 - iter 1176/1476 - loss 0.31351190 - time (sec): 18.95 - samples/sec: 6893.34 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:44:39,107 epoch 5 - iter 1323/1476 - loss 0.31194294 - time (sec): 21.36 - samples/sec: 6974.05 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:44:41,419 epoch 5 - iter 1470/1476 - loss 0.31206302 - time (sec): 23.67 - samples/sec: 7004.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:44:41,510 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:41,511 EPOCH 5 done: loss 0.3121 - lr: 0.000017
2023-10-18 18:44:48,628 DEV : loss 0.26452046632766724 - f1-score (micro avg) 0.4284
2023-10-18 18:44:48,654 saving best model
2023-10-18 18:44:48,691 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:51,083 epoch 6 - iter 147/1476 - loss 0.30324497 - time (sec): 2.39 - samples/sec: 7532.60 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:44:53,512 epoch 6 - iter 294/1476 - loss 0.28835079 - time (sec): 4.82 - samples/sec: 7249.21 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:44:55,909 epoch 6 - iter 441/1476 - loss 0.28662018 - time (sec): 7.22 - samples/sec: 7110.57 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:44:58,264 epoch 6 - iter 588/1476 - loss 0.29081123 - time (sec): 9.57 - samples/sec: 7068.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:45:00,689 epoch 6 - iter 735/1476 - loss 0.28695726 - time (sec): 12.00 - samples/sec: 7051.66 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:45:03,003 epoch 6 - iter 882/1476 - loss 0.29455832 - time (sec): 14.31 - samples/sec: 6931.39 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:45:05,403 epoch 6 - iter 1029/1476 - loss 0.29671075 - time (sec): 16.71 - samples/sec: 6946.77 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:45:07,722 epoch 6 - iter 1176/1476 - loss 0.29458444 - time (sec): 19.03 - samples/sec: 6913.91 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:45:10,046 epoch 6 - iter 1323/1476 - loss 0.29287569 - time (sec): 21.35 - samples/sec: 6903.39 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:45:12,430 epoch 6 - iter 1470/1476 - loss 0.29310511 - time (sec): 23.74 - samples/sec: 6974.93 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:45:12,531 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:12,531 EPOCH 6 done: loss 0.2924 - lr: 0.000013
2023-10-18 18:45:19,620 DEV : loss 0.25417211651802063 - f1-score (micro avg) 0.4487
2023-10-18 18:45:19,645 saving best model
2023-10-18 18:45:19,689 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:22,089 epoch 7 - iter 147/1476 - loss 0.26935127 - time (sec): 2.40 - samples/sec: 6983.45 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:45:24,410 epoch 7 - iter 294/1476 - loss 0.28105637 - time (sec): 4.72 - samples/sec: 6999.27 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:45:26,729 epoch 7 - iter 441/1476 - loss 0.26761446 - time (sec): 7.04 - samples/sec: 6984.77 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:45:29,056 epoch 7 - iter 588/1476 - loss 0.27708379 - time (sec): 9.37 - samples/sec: 6995.49 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:45:31,434 epoch 7 - iter 735/1476 - loss 0.27588589 - time (sec): 11.74 - samples/sec: 7045.15 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:45:33,886 epoch 7 - iter 882/1476 - loss 0.27477675 - time (sec): 14.20 - samples/sec: 6994.89 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:45:36,251 epoch 7 - iter 1029/1476 - loss 0.27432064 - time (sec): 16.56 - samples/sec: 6982.17 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:45:38,753 epoch 7 - iter 1176/1476 - loss 0.27174772 - time (sec): 19.06 - samples/sec: 6939.55 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:45:41,236 epoch 7 - iter 1323/1476 - loss 0.27271128 - time (sec): 21.55 - samples/sec: 6912.42 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:45:43,620 epoch 7 - iter 1470/1476 - loss 0.27459010 - time (sec): 23.93 - samples/sec: 6932.62 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:45:43,716 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:43,717 EPOCH 7 done: loss 0.2748 - lr: 0.000010
2023-10-18 18:45:50,837 DEV : loss 0.2561444044113159 - f1-score (micro avg) 0.4575
2023-10-18 18:45:50,862 saving best model
2023-10-18 18:45:50,900 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:53,369 epoch 8 - iter 147/1476 - loss 0.27413252 - time (sec): 2.47 - samples/sec: 8275.39 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:45:55,733 epoch 8 - iter 294/1476 - loss 0.27388149 - time (sec): 4.83 - samples/sec: 7719.31 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:45:58,088 epoch 8 - iter 441/1476 - loss 0.27274651 - time (sec): 7.19 - samples/sec: 7394.54 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:46:00,809 epoch 8 - iter 588/1476 - loss 0.27643641 - time (sec): 9.91 - samples/sec: 7042.61 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:46:03,084 epoch 8 - iter 735/1476 - loss 0.27009636 - time (sec): 12.18 - samples/sec: 6960.18 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:46:05,465 epoch 8 - iter 882/1476 - loss 0.26957002 - time (sec): 14.56 - samples/sec: 6919.68 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:46:07,775 epoch 8 - iter 1029/1476 - loss 0.26747275 - time (sec): 16.87 - samples/sec: 6925.56 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:46:10,253 epoch 8 - iter 1176/1476 - loss 0.26851629 - time (sec): 19.35 - samples/sec: 6915.17 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:46:12,590 epoch 8 - iter 1323/1476 - loss 0.26905409 - time (sec): 21.69 - samples/sec: 6894.10 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:46:14,955 epoch 8 - iter 1470/1476 - loss 0.26863538 - time (sec): 24.05 - samples/sec: 6896.96 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:46:15,045 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:15,045 EPOCH 8 done: loss 0.2688 - lr: 0.000007
2023-10-18 18:46:22,285 DEV : loss 0.24966885149478912 - f1-score (micro avg) 0.4711
2023-10-18 18:46:22,310 saving best model
2023-10-18 18:46:22,350 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:24,624 epoch 9 - iter 147/1476 - loss 0.26497575 - time (sec): 2.27 - samples/sec: 6837.90 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:46:26,935 epoch 9 - iter 294/1476 - loss 0.25295697 - time (sec): 4.58 - samples/sec: 6872.19 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:46:29,313 epoch 9 - iter 441/1476 - loss 0.26741752 - time (sec): 6.96 - samples/sec: 7248.47 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:46:31,663 epoch 9 - iter 588/1476 - loss 0.26622468 - time (sec): 9.31 - samples/sec: 7194.86 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:46:33,964 epoch 9 - iter 735/1476 - loss 0.26781353 - time (sec): 11.61 - samples/sec: 7123.94 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:46:36,257 epoch 9 - iter 882/1476 - loss 0.26776521 - time (sec): 13.91 - samples/sec: 7083.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:46:38,629 epoch 9 - iter 1029/1476 - loss 0.26644524 - time (sec): 16.28 - samples/sec: 7099.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:46:40,957 epoch 9 - iter 1176/1476 - loss 0.26542276 - time (sec): 18.61 - samples/sec: 7091.97 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:46:43,289 epoch 9 - iter 1323/1476 - loss 0.26480558 - time (sec): 20.94 - samples/sec: 7023.52 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:46:45,685 epoch 9 - iter 1470/1476 - loss 0.26287148 - time (sec): 23.33 - samples/sec: 7090.80 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:46:45,789 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:45,789 EPOCH 9 done: loss 0.2622 - lr: 0.000003
2023-10-18 18:46:52,888 DEV : loss 0.2507059872150421 - f1-score (micro avg) 0.4744
2023-10-18 18:46:52,913 saving best model
2023-10-18 18:46:52,955 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:55,376 epoch 10 - iter 147/1476 - loss 0.22586996 - time (sec): 2.42 - samples/sec: 7307.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:46:57,716 epoch 10 - iter 294/1476 - loss 0.23115650 - time (sec): 4.76 - samples/sec: 7185.30 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:47:00,090 epoch 10 - iter 441/1476 - loss 0.24322112 - time (sec): 7.13 - samples/sec: 7118.39 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:47:02,428 epoch 10 - iter 588/1476 - loss 0.25190140 - time (sec): 9.47 - samples/sec: 6993.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:47:04,802 epoch 10 - iter 735/1476 - loss 0.25558653 - time (sec): 11.85 - samples/sec: 7004.81 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:47:07,197 epoch 10 - iter 882/1476 - loss 0.25425158 - time (sec): 14.24 - samples/sec: 7056.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:47:09,373 epoch 10 - iter 1029/1476 - loss 0.25465722 - time (sec): 16.42 - samples/sec: 7091.58 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:47:11,469 epoch 10 - iter 1176/1476 - loss 0.25966368 - time (sec): 18.51 - samples/sec: 7244.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:47:13,530 epoch 10 - iter 1323/1476 - loss 0.25860524 - time (sec): 20.57 - samples/sec: 7315.70 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:47:15,573 epoch 10 - iter 1470/1476 - loss 0.25742169 - time (sec): 22.62 - samples/sec: 7336.12 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:47:15,652 ----------------------------------------------------------------------------------------------------
2023-10-18 18:47:15,652 EPOCH 10 done: loss 0.2572 - lr: 0.000000
2023-10-18 18:47:22,783 DEV : loss 0.24806223809719086 - f1-score (micro avg) 0.4774
2023-10-18 18:47:22,808 saving best model
2023-10-18 18:47:22,872 ----------------------------------------------------------------------------------------------------
2023-10-18 18:47:22,872 Loading model from best epoch ...
2023-10-18 18:47:22,947 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
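The 21 tags are the BIOES encoding of the corpus's five entity types (loc, pers, org, time, prod) plus the outside tag O: 5 types x 4 positional prefixes + 1 = 21, matching out_features=21 of the model's final linear layer. A quick sketch that reproduces the dictionary:

```python
# BIOES tagset: one S/B/E/I tag per entity type, plus "O" for outside.
entity_types = ["loc", "pers", "org", "time", "prod"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 21
```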
2023-10-18 18:47:25,503
Results:
- F-score (micro) 0.5021
- F-score (macro) 0.2868
- Accuracy 0.3569
By class:
              precision    recall  f1-score   support

         loc     0.5109    0.7343    0.6026       858
        pers     0.3793    0.4767    0.4224       537
         org     0.2000    0.0303    0.0526       132
        time     0.3830    0.3333    0.3564        54
        prod     0.0000    0.0000    0.0000        61

   micro avg     0.4597    0.5530    0.5021      1642
   macro avg     0.2946    0.3149    0.2868      1642
weighted avg     0.4197    0.5530    0.4690      1642
2023-10-18 18:47:25,503 ----------------------------------------------------------------------------------------------------
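The aggregate rows of the table above can be sanity-checked from the per-class values: micro F1 is the harmonic mean of the micro precision and recall, macro F1 is the unweighted mean of the per-class F1 scores, and the weighted average weights each class F1 by its support. A sketch of that check:

```python
# Per-class (precision, recall, f1, support), copied from the table above.
rows = {
    "loc":  (0.5109, 0.7343, 0.6026, 858),
    "pers": (0.3793, 0.4767, 0.4224, 537),
    "org":  (0.2000, 0.0303, 0.0526, 132),
    "time": (0.3830, 0.3333, 0.3564, 54),
    "prod": (0.0000, 0.0000, 0.0000, 61),
}
total_support = sum(s for *_, s in rows.values())

# Micro F1: harmonic mean of micro precision and recall.
micro_p, micro_r = 0.4597, 0.5530
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1: unweighted mean of per-class F1.
macro_f1 = sum(f for _, _, f, _ in rows.values()) / len(rows)

# Weighted F1: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total_support

print(f"{micro_f1:.4f} {macro_f1:.4f} {weighted_f1:.4f}")  # 0.5021 0.2868 0.4690
```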