2023-10-16 18:20:17,455 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Train: 1166 sentences
2023-10-16 18:20:17,456 (train_with_dev=False, train_with_test=False)
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Training Params:
2023-10-16 18:20:17,456 - learning_rate: "3e-05"
2023-10-16 18:20:17,456 - mini_batch_size: "4"
2023-10-16 18:20:17,456 - max_epochs: "10"
2023-10-16 18:20:17,456 - shuffle: "True"
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Plugins:
2023-10-16 18:20:17,456 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:20:17,456 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Computation:
2023-10-16 18:20:17,456 - compute on device: cuda:0
2023-10-16 18:20:17,456 - embedding storage: none
2023-10-16 18:20:17,457 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,457 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 18:20:17,457 ----------------------------------------------------------------------------------------------------
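The "lr:" column in the iteration lines below follows the LinearScheduler plugin's curve: with warmup_fraction 0.1 over 10 epochs x 292 batches = 2920 steps, the learning rate ramps linearly from 0 to the peak 3e-05 during epoch 1 (the first 292 steps), then decays linearly to 0. A minimal stdlib sketch of that schedule — the function name and step arithmetic are assumptions reverse-engineered from the logged values, not Flair's actual implementation:

```python
# Sketch of the logged linear warmup/decay LR schedule
# (warmup_fraction 0.1, peak 3e-05, 10 epochs x 292 batches).
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 292
TOTAL_STEPS = 10 * STEPS_PER_EPOCH      # 2920
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # 292, i.e. exactly epoch 1

def linear_warmup_lr(step: int) -> float:
    """Linear ramp 0 -> peak over the warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the logged values:
#   epoch 1,  iter 29  (step 29)   -> lr 0.000003
#   epoch 1,  iter 290 (step 290)  -> lr ~0.000030
#   epoch 10, iter 290 (step 2918) -> lr 0.000000
print(round(linear_warmup_lr(29), 6))    # 3e-06
print(round(linear_warmup_lr(2918), 6))  # 0.0
```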
2023-10-16 18:20:17,457 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:19,012 epoch 1 - iter 29/292 - loss 2.89452854 - time (sec): 1.55 - samples/sec: 2610.85 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:20:20,543 epoch 1 - iter 58/292 - loss 2.56030484 - time (sec): 3.09 - samples/sec: 2444.98 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:20:22,417 epoch 1 - iter 87/292 - loss 1.72394676 - time (sec): 4.96 - samples/sec: 2616.09 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:20:23,974 epoch 1 - iter 116/292 - loss 1.47947989 - time (sec): 6.52 - samples/sec: 2614.50 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:20:25,566 epoch 1 - iter 145/292 - loss 1.33553521 - time (sec): 8.11 - samples/sec: 2601.11 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:20:27,124 epoch 1 - iter 174/292 - loss 1.19681158 - time (sec): 9.67 - samples/sec: 2559.05 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:20:28,991 epoch 1 - iter 203/292 - loss 1.05794959 - time (sec): 11.53 - samples/sec: 2607.47 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:20:30,683 epoch 1 - iter 232/292 - loss 0.94552175 - time (sec): 13.23 - samples/sec: 2641.20 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:20:32,284 epoch 1 - iter 261/292 - loss 0.86717661 - time (sec): 14.83 - samples/sec: 2650.81 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:33,975 epoch 1 - iter 290/292 - loss 0.79892846 - time (sec): 16.52 - samples/sec: 2676.74 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:20:34,068 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:34,068 EPOCH 1 done: loss 0.7958 - lr: 0.000030
2023-10-16 18:20:35,329 DEV : loss 0.18650421500205994 - f1-score (micro avg) 0.5083
2023-10-16 18:20:35,336 saving best model
2023-10-16 18:20:35,710 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:37,446 epoch 2 - iter 29/292 - loss 0.22066234 - time (sec): 1.73 - samples/sec: 2805.89 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:20:39,219 epoch 2 - iter 58/292 - loss 0.20180903 - time (sec): 3.51 - samples/sec: 2757.79 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:20:40,760 epoch 2 - iter 87/292 - loss 0.21524233 - time (sec): 5.05 - samples/sec: 2703.85 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:20:42,386 epoch 2 - iter 116/292 - loss 0.20437921 - time (sec): 6.67 - samples/sec: 2637.60 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:20:44,125 epoch 2 - iter 145/292 - loss 0.20407849 - time (sec): 8.41 - samples/sec: 2633.71 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:20:45,791 epoch 2 - iter 174/292 - loss 0.21017489 - time (sec): 10.08 - samples/sec: 2673.66 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:20:47,443 epoch 2 - iter 203/292 - loss 0.20169462 - time (sec): 11.73 - samples/sec: 2676.79 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:20:48,926 epoch 2 - iter 232/292 - loss 0.19676618 - time (sec): 13.21 - samples/sec: 2666.39 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:50,536 epoch 2 - iter 261/292 - loss 0.19762036 - time (sec): 14.82 - samples/sec: 2697.55 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:52,157 epoch 2 - iter 290/292 - loss 0.19024544 - time (sec): 16.45 - samples/sec: 2696.27 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:52,241 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:52,241 EPOCH 2 done: loss 0.1899 - lr: 0.000027
2023-10-16 18:20:53,557 DEV : loss 0.12093638628721237 - f1-score (micro avg) 0.6793
2023-10-16 18:20:53,566 saving best model
2023-10-16 18:20:54,094 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:56,278 epoch 3 - iter 29/292 - loss 0.17753294 - time (sec): 2.18 - samples/sec: 2480.59 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:20:58,072 epoch 3 - iter 58/292 - loss 0.18357718 - time (sec): 3.98 - samples/sec: 2415.49 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:20:59,986 epoch 3 - iter 87/292 - loss 0.16005714 - time (sec): 5.89 - samples/sec: 2495.71 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:21:01,620 epoch 3 - iter 116/292 - loss 0.14422489 - time (sec): 7.52 - samples/sec: 2563.83 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:21:03,360 epoch 3 - iter 145/292 - loss 0.13857187 - time (sec): 9.26 - samples/sec: 2627.64 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:21:05,026 epoch 3 - iter 174/292 - loss 0.13043156 - time (sec): 10.93 - samples/sec: 2575.18 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:21:06,711 epoch 3 - iter 203/292 - loss 0.12565357 - time (sec): 12.62 - samples/sec: 2527.63 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:21:08,371 epoch 3 - iter 232/292 - loss 0.12072889 - time (sec): 14.27 - samples/sec: 2545.07 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:21:10,093 epoch 3 - iter 261/292 - loss 0.12031084 - time (sec): 16.00 - samples/sec: 2523.45 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:21:11,747 epoch 3 - iter 290/292 - loss 0.11654858 - time (sec): 17.65 - samples/sec: 2506.89 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:21:11,847 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:11,847 EPOCH 3 done: loss 0.1165 - lr: 0.000023
2023-10-16 18:21:13,137 DEV : loss 0.12300916761159897 - f1-score (micro avg) 0.6891
2023-10-16 18:21:13,142 saving best model
2023-10-16 18:21:13,598 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:15,278 epoch 4 - iter 29/292 - loss 0.07691043 - time (sec): 1.68 - samples/sec: 2351.61 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:21:17,065 epoch 4 - iter 58/292 - loss 0.07953942 - time (sec): 3.46 - samples/sec: 2375.70 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:21:18,849 epoch 4 - iter 87/292 - loss 0.08591602 - time (sec): 5.25 - samples/sec: 2352.10 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:21:20,722 epoch 4 - iter 116/292 - loss 0.08257757 - time (sec): 7.12 - samples/sec: 2360.53 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:21:22,305 epoch 4 - iter 145/292 - loss 0.07813624 - time (sec): 8.70 - samples/sec: 2409.82 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:21:23,960 epoch 4 - iter 174/292 - loss 0.07972020 - time (sec): 10.36 - samples/sec: 2466.85 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:21:25,683 epoch 4 - iter 203/292 - loss 0.08347839 - time (sec): 12.08 - samples/sec: 2440.27 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:21:27,464 epoch 4 - iter 232/292 - loss 0.08291309 - time (sec): 13.86 - samples/sec: 2459.88 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:21:29,179 epoch 4 - iter 261/292 - loss 0.08335854 - time (sec): 15.58 - samples/sec: 2471.62 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:21:31,135 epoch 4 - iter 290/292 - loss 0.07773530 - time (sec): 17.53 - samples/sec: 2523.16 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:21:31,236 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:31,236 EPOCH 4 done: loss 0.0777 - lr: 0.000020
2023-10-16 18:21:32,765 DEV : loss 0.12332110106945038 - f1-score (micro avg) 0.74
2023-10-16 18:21:32,770 saving best model
2023-10-16 18:21:33,320 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:35,152 epoch 5 - iter 29/292 - loss 0.07471586 - time (sec): 1.83 - samples/sec: 2357.99 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:21:36,892 epoch 5 - iter 58/292 - loss 0.06384738 - time (sec): 3.57 - samples/sec: 2411.66 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:21:38,709 epoch 5 - iter 87/292 - loss 0.05322672 - time (sec): 5.39 - samples/sec: 2459.72 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:21:40,429 epoch 5 - iter 116/292 - loss 0.05394182 - time (sec): 7.11 - samples/sec: 2408.65 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:21:42,049 epoch 5 - iter 145/292 - loss 0.05371064 - time (sec): 8.73 - samples/sec: 2473.41 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:21:43,756 epoch 5 - iter 174/292 - loss 0.05377309 - time (sec): 10.43 - samples/sec: 2470.82 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:21:45,483 epoch 5 - iter 203/292 - loss 0.05506603 - time (sec): 12.16 - samples/sec: 2490.49 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:21:47,281 epoch 5 - iter 232/292 - loss 0.05569026 - time (sec): 13.96 - samples/sec: 2523.27 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:21:49,057 epoch 5 - iter 261/292 - loss 0.05310602 - time (sec): 15.73 - samples/sec: 2513.82 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:21:50,796 epoch 5 - iter 290/292 - loss 0.05168977 - time (sec): 17.47 - samples/sec: 2532.96 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:21:50,895 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:50,895 EPOCH 5 done: loss 0.0515 - lr: 0.000017
2023-10-16 18:21:52,154 DEV : loss 0.12363986670970917 - f1-score (micro avg) 0.7598
2023-10-16 18:21:52,158 saving best model
2023-10-16 18:21:52,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:54,644 epoch 6 - iter 29/292 - loss 0.03999197 - time (sec): 1.85 - samples/sec: 2786.58 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:21:56,201 epoch 6 - iter 58/292 - loss 0.03737188 - time (sec): 3.41 - samples/sec: 2646.70 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:21:57,780 epoch 6 - iter 87/292 - loss 0.03607318 - time (sec): 4.99 - samples/sec: 2599.09 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:21:59,610 epoch 6 - iter 116/292 - loss 0.03250214 - time (sec): 6.82 - samples/sec: 2571.56 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:22:01,167 epoch 6 - iter 145/292 - loss 0.03164073 - time (sec): 8.38 - samples/sec: 2659.55 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:22:02,699 epoch 6 - iter 174/292 - loss 0.03264454 - time (sec): 9.91 - samples/sec: 2641.66 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:22:04,383 epoch 6 - iter 203/292 - loss 0.03064583 - time (sec): 11.59 - samples/sec: 2628.28 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:22:06,149 epoch 6 - iter 232/292 - loss 0.03250064 - time (sec): 13.36 - samples/sec: 2627.00 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:22:07,985 epoch 6 - iter 261/292 - loss 0.03958890 - time (sec): 15.19 - samples/sec: 2650.19 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:22:09,627 epoch 6 - iter 290/292 - loss 0.04044838 - time (sec): 16.84 - samples/sec: 2627.04 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:22:09,715 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:09,716 EPOCH 6 done: loss 0.0403 - lr: 0.000013
2023-10-16 18:22:10,948 DEV : loss 0.13180699944496155 - f1-score (micro avg) 0.75
2023-10-16 18:22:10,953 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:12,761 epoch 7 - iter 29/292 - loss 0.02941895 - time (sec): 1.81 - samples/sec: 3067.01 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:22:14,410 epoch 7 - iter 58/292 - loss 0.02180690 - time (sec): 3.46 - samples/sec: 2846.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:22:16,073 epoch 7 - iter 87/292 - loss 0.02244354 - time (sec): 5.12 - samples/sec: 2753.55 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:22:17,742 epoch 7 - iter 116/292 - loss 0.02270389 - time (sec): 6.79 - samples/sec: 2678.76 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:22:19,376 epoch 7 - iter 145/292 - loss 0.02838678 - time (sec): 8.42 - samples/sec: 2675.03 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:22:21,149 epoch 7 - iter 174/292 - loss 0.03006600 - time (sec): 10.19 - samples/sec: 2671.55 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:22:22,667 epoch 7 - iter 203/292 - loss 0.02892158 - time (sec): 11.71 - samples/sec: 2675.80 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:22:24,337 epoch 7 - iter 232/292 - loss 0.02871850 - time (sec): 13.38 - samples/sec: 2694.09 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:22:25,903 epoch 7 - iter 261/292 - loss 0.03081396 - time (sec): 14.95 - samples/sec: 2700.95 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:22:27,578 epoch 7 - iter 290/292 - loss 0.03280251 - time (sec): 16.62 - samples/sec: 2668.03 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:22:27,670 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:27,670 EPOCH 7 done: loss 0.0327 - lr: 0.000010
2023-10-16 18:22:28,952 DEV : loss 0.15362893044948578 - f1-score (micro avg) 0.7722
2023-10-16 18:22:28,957 saving best model
2023-10-16 18:22:29,505 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:31,139 epoch 8 - iter 29/292 - loss 0.02176979 - time (sec): 1.63 - samples/sec: 2657.39 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:22:32,842 epoch 8 - iter 58/292 - loss 0.01588976 - time (sec): 3.33 - samples/sec: 2742.97 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:22:34,581 epoch 8 - iter 87/292 - loss 0.01818793 - time (sec): 5.07 - samples/sec: 2642.71 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:22:36,177 epoch 8 - iter 116/292 - loss 0.01807534 - time (sec): 6.67 - samples/sec: 2591.67 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:22:37,886 epoch 8 - iter 145/292 - loss 0.01972452 - time (sec): 8.38 - samples/sec: 2628.33 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:22:39,553 epoch 8 - iter 174/292 - loss 0.02020388 - time (sec): 10.04 - samples/sec: 2629.04 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:22:41,327 epoch 8 - iter 203/292 - loss 0.02214309 - time (sec): 11.82 - samples/sec: 2598.89 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:22:43,085 epoch 8 - iter 232/292 - loss 0.02534168 - time (sec): 13.57 - samples/sec: 2619.77 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:22:44,552 epoch 8 - iter 261/292 - loss 0.02444304 - time (sec): 15.04 - samples/sec: 2607.05 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:22:46,287 epoch 8 - iter 290/292 - loss 0.02378208 - time (sec): 16.78 - samples/sec: 2631.90 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:22:46,394 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:46,395 EPOCH 8 done: loss 0.0239 - lr: 0.000007
2023-10-16 18:22:47,882 DEV : loss 0.15757989883422852 - f1-score (micro avg) 0.7409
2023-10-16 18:22:47,887 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:49,421 epoch 9 - iter 29/292 - loss 0.01149395 - time (sec): 1.53 - samples/sec: 2871.12 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:22:51,306 epoch 9 - iter 58/292 - loss 0.01863564 - time (sec): 3.42 - samples/sec: 2695.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:22:52,911 epoch 9 - iter 87/292 - loss 0.02194483 - time (sec): 5.02 - samples/sec: 2673.40 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:22:54,625 epoch 9 - iter 116/292 - loss 0.02014668 - time (sec): 6.74 - samples/sec: 2723.47 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:22:56,304 epoch 9 - iter 145/292 - loss 0.01858049 - time (sec): 8.42 - samples/sec: 2747.63 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:22:58,135 epoch 9 - iter 174/292 - loss 0.01892734 - time (sec): 10.25 - samples/sec: 2740.43 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:22:59,789 epoch 9 - iter 203/292 - loss 0.02059484 - time (sec): 11.90 - samples/sec: 2709.42 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:23:01,371 epoch 9 - iter 232/292 - loss 0.01991698 - time (sec): 13.48 - samples/sec: 2680.80 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:23:03,051 epoch 9 - iter 261/292 - loss 0.01931874 - time (sec): 15.16 - samples/sec: 2644.81 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:23:04,650 epoch 9 - iter 290/292 - loss 0.01784804 - time (sec): 16.76 - samples/sec: 2632.11 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:23:04,759 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:04,759 EPOCH 9 done: loss 0.0180 - lr: 0.000003
2023-10-16 18:23:06,017 DEV : loss 0.16514389216899872 - f1-score (micro avg) 0.742
2023-10-16 18:23:06,022 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:07,633 epoch 10 - iter 29/292 - loss 0.00947002 - time (sec): 1.61 - samples/sec: 2910.42 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:23:09,292 epoch 10 - iter 58/292 - loss 0.01129762 - time (sec): 3.27 - samples/sec: 2974.22 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:23:10,905 epoch 10 - iter 87/292 - loss 0.01868934 - time (sec): 4.88 - samples/sec: 2859.56 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:23:12,479 epoch 10 - iter 116/292 - loss 0.01805683 - time (sec): 6.46 - samples/sec: 2805.78 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:23:14,118 epoch 10 - iter 145/292 - loss 0.01706716 - time (sec): 8.10 - samples/sec: 2771.34 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:23:15,875 epoch 10 - iter 174/292 - loss 0.01588592 - time (sec): 9.85 - samples/sec: 2794.65 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:23:17,533 epoch 10 - iter 203/292 - loss 0.01507664 - time (sec): 11.51 - samples/sec: 2768.98 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:23:19,143 epoch 10 - iter 232/292 - loss 0.01490814 - time (sec): 13.12 - samples/sec: 2723.95 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:23:20,862 epoch 10 - iter 261/292 - loss 0.01407332 - time (sec): 14.84 - samples/sec: 2720.91 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:23:22,436 epoch 10 - iter 290/292 - loss 0.01532912 - time (sec): 16.41 - samples/sec: 2693.94 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:23:22,535 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:22,536 EPOCH 10 done: loss 0.0153 - lr: 0.000000
2023-10-16 18:23:23,815 DEV : loss 0.16304908692836761 - f1-score (micro avg) 0.7319
2023-10-16 18:23:24,214 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:24,216 Loading model from best epoch ...
2023-10-16 18:23:25,920 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
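The 17-tag dictionary above is the BIOES expansion of the corpus's four entity types plus the outside tag O (S = single-token span, B/E = span begin/end, I = inside). A quick sketch reproducing it, in the order the log prints:

```python
# Reconstruct the 17-tag BIOES dictionary from the four entity types
# of the NewsEye Finnish HIPE-2022 corpus.
ENTITY_TYPES = ["LOC", "PER", "ORG", "HumanProd"]

tags = ["O"] + [f"{prefix}-{etype}"
                for etype in ENTITY_TYPES
                for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```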
2023-10-16 18:23:28,554
Results:
- F-score (micro) 0.7389
- F-score (macro) 0.6609
- Accuracy 0.609
By class:
              precision    recall  f1-score   support

         PER     0.7801    0.8563    0.8164       348
         LOC     0.6350    0.7931    0.7053       261
         ORG     0.3750    0.3462    0.3600        52
   HumanProd     0.8000    0.7273    0.7619        22

   micro avg     0.6946    0.7892    0.7389       683
   macro avg     0.6475    0.6807    0.6609       683
weighted avg     0.6944    0.7892    0.7375       683
2023-10-16 18:23:28,554 ----------------------------------------------------------------------------------------------------
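The aggregate rows of the final report follow directly from the per-class numbers: micro F1 is the harmonic mean of the micro-averaged precision and recall, macro F1 is the unweighted mean of the per-class F1 scores, and weighted F1 is the support-weighted mean. A stdlib cross-check against the values logged above:

```python
# Cross-check the aggregate rows of the final evaluation report
# against the per-class scores logged above.
per_class = {  # class: (precision, recall, f1, support)
    "PER":       (0.7801, 0.8563, 0.8164, 348),
    "LOC":       (0.6350, 0.7931, 0.7053, 261),
    "ORG":       (0.3750, 0.3462, 0.3600,  52),
    "HumanProd": (0.8000, 0.7273, 0.7619,  22),
}

# Micro F1 = harmonic mean of micro-averaged precision and recall.
micro_p, micro_r = 0.6946, 0.7892
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1 = unweighted mean of per-class F1.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1 = support-weighted mean of per-class F1. This reproduces
# the logged 0.7375 only to ~3 decimals, because the per-class F1s
# here are themselves already rounded.
total = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total

print(round(micro_f1, 4))  # 0.7389
print(round(macro_f1, 4))  # 0.6609
```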