2023-10-14 21:54:59,595 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
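For readers who want to reproduce a comparable setup, the following is a minimal sketch (not the exact script behind this log) of how a tagger matching the repr above could be assembled with Flair. The checkpoint name, pooling, layer selection, and CRF settings are inferred from the model path and architecture printout, so treat them as assumptions; corpus loading is sketched in more detail after the corpus summary below.

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Build the label dictionary from the HIPE-2022 letemps corpus
# (see the corpus-loading sketch after the corpus summary below).
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",  # hmBERT base checkpoint, assumed from the run name
    layers="-1",               # single (last) layer, matching "layers-1" in the model path
    subtoken_pooling="first",  # matching "poolingfirst" in the model path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,           # not used when use_rnn=False, but the argument is required
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the model path
    use_rnn=False,             # no RNN: LockedDropout + Linear(768 -> 13) head, as in the repr above
)
```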
2023-10-14 21:54:59,596 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Train: 14465 sentences
2023-10-14 21:54:59,596 (train_with_dev=False, train_with_test=False)
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
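A hedged sketch of loading this corpus with Flair's built-in HIPE-2022 loader follows; the constructor arguments are guesses inferred from the cache path ".../ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator" and may differ from the options actually used for this run.

```python
from flair.datasets import NER_HIPE_2022

# Arguments below are assumptions reconstructed from the dataset cache path.
corpus = NER_HIPE_2022(
    dataset_name="letemps",
    language="fr",
    version="v2.1",
    add_document_separator=True,  # assumed from the "with_doc_seperator" folder name
)
print(corpus)  # should report 14465 train + 1392 dev + 2432 test sentences
```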
2023-10-14 21:54:59,596 Training Params:
2023-10-14 21:54:59,596 - learning_rate: "3e-05"
2023-10-14 21:54:59,596 - mini_batch_size: "4"
2023-10-14 21:54:59,596 - max_epochs: "10"
2023-10-14 21:54:59,596 - shuffle: "True"
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Plugins:
2023-10-14 21:54:59,596 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
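Assuming the tagger and corpus from the sketches above, fine-tuning with the listed parameters could look roughly like this. In recent Flair releases, fine_tune() attaches a linear learning-rate schedule with warmup by default, which would match the LinearScheduler plugin (warmup_fraction 0.1) shown above; any remaining flags of the actual run are not recorded in this log.

```python
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# Base path taken from the "Model training base path" line below; other
# keyword arguments mirror the Training Params section above.
trainer.fine_tune(
    "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
)
```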
2023-10-14 21:54:59,596 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 21:54:59,596 - metric: "('micro avg', 'f1-score')"
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Computation:
2023-10-14 21:54:59,596 - compute on device: cuda:0
2023-10-14 21:54:59,597 - embedding storage: none
2023-10-14 21:54:59,597 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,597 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 21:54:59,597 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,597 ----------------------------------------------------------------------------------------------------
2023-10-14 21:55:18,748 epoch 1 - iter 361/3617 - loss 1.37470380 - time (sec): 19.15 - samples/sec: 2000.76 - lr: 0.000003 - momentum: 0.000000
2023-10-14 21:55:36,278 epoch 1 - iter 722/3617 - loss 0.80586607 - time (sec): 36.68 - samples/sec: 2074.90 - lr: 0.000006 - momentum: 0.000000
2023-10-14 21:55:52,539 epoch 1 - iter 1083/3617 - loss 0.59505650 - time (sec): 52.94 - samples/sec: 2137.86 - lr: 0.000009 - momentum: 0.000000
2023-10-14 21:56:08,972 epoch 1 - iter 1444/3617 - loss 0.47633157 - time (sec): 69.37 - samples/sec: 2200.13 - lr: 0.000012 - momentum: 0.000000
2023-10-14 21:56:25,348 epoch 1 - iter 1805/3617 - loss 0.40695084 - time (sec): 85.75 - samples/sec: 2221.40 - lr: 0.000015 - momentum: 0.000000
2023-10-14 21:56:41,760 epoch 1 - iter 2166/3617 - loss 0.35885887 - time (sec): 102.16 - samples/sec: 2250.04 - lr: 0.000018 - momentum: 0.000000
2023-10-14 21:56:57,915 epoch 1 - iter 2527/3617 - loss 0.32613123 - time (sec): 118.32 - samples/sec: 2256.83 - lr: 0.000021 - momentum: 0.000000
2023-10-14 21:57:14,256 epoch 1 - iter 2888/3617 - loss 0.29928065 - time (sec): 134.66 - samples/sec: 2261.76 - lr: 0.000024 - momentum: 0.000000
2023-10-14 21:57:30,968 epoch 1 - iter 3249/3617 - loss 0.27794024 - time (sec): 151.37 - samples/sec: 2260.71 - lr: 0.000027 - momentum: 0.000000
2023-10-14 21:57:47,160 epoch 1 - iter 3610/3617 - loss 0.26205461 - time (sec): 167.56 - samples/sec: 2263.05 - lr: 0.000030 - momentum: 0.000000
2023-10-14 21:57:47,478 ----------------------------------------------------------------------------------------------------
2023-10-14 21:57:47,478 EPOCH 1 done: loss 0.2617 - lr: 0.000030
2023-10-14 21:57:52,218 DEV : loss 0.12341408431529999 - f1-score (micro avg) 0.5943
2023-10-14 21:57:52,255 saving best model
2023-10-14 21:57:52,708 ----------------------------------------------------------------------------------------------------
2023-10-14 21:58:10,216 epoch 2 - iter 361/3617 - loss 0.10403458 - time (sec): 17.51 - samples/sec: 2122.79 - lr: 0.000030 - momentum: 0.000000
2023-10-14 21:58:26,642 epoch 2 - iter 722/3617 - loss 0.10235270 - time (sec): 33.93 - samples/sec: 2224.39 - lr: 0.000029 - momentum: 0.000000
2023-10-14 21:58:43,240 epoch 2 - iter 1083/3617 - loss 0.10201030 - time (sec): 50.53 - samples/sec: 2254.16 - lr: 0.000029 - momentum: 0.000000
2023-10-14 21:58:59,738 epoch 2 - iter 1444/3617 - loss 0.10128214 - time (sec): 67.03 - samples/sec: 2281.33 - lr: 0.000029 - momentum: 0.000000
2023-10-14 21:59:15,971 epoch 2 - iter 1805/3617 - loss 0.09923902 - time (sec): 83.26 - samples/sec: 2304.80 - lr: 0.000028 - momentum: 0.000000
2023-10-14 21:59:32,225 epoch 2 - iter 2166/3617 - loss 0.09995662 - time (sec): 99.51 - samples/sec: 2306.90 - lr: 0.000028 - momentum: 0.000000
2023-10-14 21:59:48,412 epoch 2 - iter 2527/3617 - loss 0.10116313 - time (sec): 115.70 - samples/sec: 2306.41 - lr: 0.000028 - momentum: 0.000000
2023-10-14 22:00:04,589 epoch 2 - iter 2888/3617 - loss 0.10129391 - time (sec): 131.88 - samples/sec: 2296.43 - lr: 0.000027 - momentum: 0.000000
2023-10-14 22:00:21,042 epoch 2 - iter 3249/3617 - loss 0.09991245 - time (sec): 148.33 - samples/sec: 2303.58 - lr: 0.000027 - momentum: 0.000000
2023-10-14 22:00:37,242 epoch 2 - iter 3610/3617 - loss 0.09941004 - time (sec): 164.53 - samples/sec: 2305.17 - lr: 0.000027 - momentum: 0.000000
2023-10-14 22:00:37,554 ----------------------------------------------------------------------------------------------------
2023-10-14 22:00:37,554 EPOCH 2 done: loss 0.0993 - lr: 0.000027
2023-10-14 22:00:43,848 DEV : loss 0.1348618119955063 - f1-score (micro avg) 0.6494
2023-10-14 22:00:43,876 saving best model
2023-10-14 22:00:44,365 ----------------------------------------------------------------------------------------------------
2023-10-14 22:01:00,728 epoch 3 - iter 361/3617 - loss 0.06498786 - time (sec): 16.36 - samples/sec: 2373.91 - lr: 0.000026 - momentum: 0.000000
2023-10-14 22:01:16,889 epoch 3 - iter 722/3617 - loss 0.07260502 - time (sec): 32.52 - samples/sec: 2355.96 - lr: 0.000026 - momentum: 0.000000
2023-10-14 22:01:33,116 epoch 3 - iter 1083/3617 - loss 0.07621223 - time (sec): 48.75 - samples/sec: 2337.97 - lr: 0.000026 - momentum: 0.000000
2023-10-14 22:01:49,355 epoch 3 - iter 1444/3617 - loss 0.07608788 - time (sec): 64.98 - samples/sec: 2351.04 - lr: 0.000025 - momentum: 0.000000
2023-10-14 22:02:05,843 epoch 3 - iter 1805/3617 - loss 0.07598984 - time (sec): 81.47 - samples/sec: 2333.63 - lr: 0.000025 - momentum: 0.000000
2023-10-14 22:02:21,958 epoch 3 - iter 2166/3617 - loss 0.07775218 - time (sec): 97.59 - samples/sec: 2335.11 - lr: 0.000025 - momentum: 0.000000
2023-10-14 22:02:38,247 epoch 3 - iter 2527/3617 - loss 0.07602466 - time (sec): 113.88 - samples/sec: 2332.17 - lr: 0.000024 - momentum: 0.000000
2023-10-14 22:02:54,496 epoch 3 - iter 2888/3617 - loss 0.07632851 - time (sec): 130.13 - samples/sec: 2327.74 - lr: 0.000024 - momentum: 0.000000
2023-10-14 22:03:10,904 epoch 3 - iter 3249/3617 - loss 0.07723473 - time (sec): 146.53 - samples/sec: 2329.75 - lr: 0.000024 - momentum: 0.000000
2023-10-14 22:03:27,104 epoch 3 - iter 3610/3617 - loss 0.07511861 - time (sec): 162.73 - samples/sec: 2330.38 - lr: 0.000023 - momentum: 0.000000
2023-10-14 22:03:27,411 ----------------------------------------------------------------------------------------------------
2023-10-14 22:03:27,411 EPOCH 3 done: loss 0.0750 - lr: 0.000023
2023-10-14 22:03:33,762 DEV : loss 0.20539723336696625 - f1-score (micro avg) 0.6436
2023-10-14 22:03:33,797 ----------------------------------------------------------------------------------------------------
2023-10-14 22:03:50,426 epoch 4 - iter 361/3617 - loss 0.04544183 - time (sec): 16.63 - samples/sec: 2346.47 - lr: 0.000023 - momentum: 0.000000
2023-10-14 22:04:06,629 epoch 4 - iter 722/3617 - loss 0.04633691 - time (sec): 32.83 - samples/sec: 2331.87 - lr: 0.000023 - momentum: 0.000000
2023-10-14 22:04:22,959 epoch 4 - iter 1083/3617 - loss 0.04843502 - time (sec): 49.16 - samples/sec: 2334.59 - lr: 0.000022 - momentum: 0.000000
2023-10-14 22:04:39,149 epoch 4 - iter 1444/3617 - loss 0.05354134 - time (sec): 65.35 - samples/sec: 2319.23 - lr: 0.000022 - momentum: 0.000000
2023-10-14 22:04:55,305 epoch 4 - iter 1805/3617 - loss 0.05419777 - time (sec): 81.51 - samples/sec: 2319.80 - lr: 0.000022 - momentum: 0.000000
2023-10-14 22:05:11,659 epoch 4 - iter 2166/3617 - loss 0.05236216 - time (sec): 97.86 - samples/sec: 2323.42 - lr: 0.000021 - momentum: 0.000000
2023-10-14 22:05:27,796 epoch 4 - iter 2527/3617 - loss 0.05247379 - time (sec): 114.00 - samples/sec: 2328.95 - lr: 0.000021 - momentum: 0.000000
2023-10-14 22:05:44,068 epoch 4 - iter 2888/3617 - loss 0.05210563 - time (sec): 130.27 - samples/sec: 2335.22 - lr: 0.000021 - momentum: 0.000000
2023-10-14 22:06:00,162 epoch 4 - iter 3249/3617 - loss 0.05267914 - time (sec): 146.36 - samples/sec: 2334.44 - lr: 0.000020 - momentum: 0.000000
2023-10-14 22:06:16,279 epoch 4 - iter 3610/3617 - loss 0.05369010 - time (sec): 162.48 - samples/sec: 2333.65 - lr: 0.000020 - momentum: 0.000000
2023-10-14 22:06:16,593 ----------------------------------------------------------------------------------------------------
2023-10-14 22:06:16,593 EPOCH 4 done: loss 0.0537 - lr: 0.000020
2023-10-14 22:06:24,103 DEV : loss 0.22502246499061584 - f1-score (micro avg) 0.6245
2023-10-14 22:06:24,150 ----------------------------------------------------------------------------------------------------
2023-10-14 22:06:42,193 epoch 5 - iter 361/3617 - loss 0.03757363 - time (sec): 18.04 - samples/sec: 2095.78 - lr: 0.000020 - momentum: 0.000000
2023-10-14 22:06:58,597 epoch 5 - iter 722/3617 - loss 0.03566151 - time (sec): 34.44 - samples/sec: 2204.45 - lr: 0.000019 - momentum: 0.000000
2023-10-14 22:07:14,945 epoch 5 - iter 1083/3617 - loss 0.03610137 - time (sec): 50.79 - samples/sec: 2237.25 - lr: 0.000019 - momentum: 0.000000
2023-10-14 22:07:31,411 epoch 5 - iter 1444/3617 - loss 0.03673278 - time (sec): 67.26 - samples/sec: 2238.77 - lr: 0.000019 - momentum: 0.000000
2023-10-14 22:07:47,752 epoch 5 - iter 1805/3617 - loss 0.03672910 - time (sec): 83.60 - samples/sec: 2251.84 - lr: 0.000018 - momentum: 0.000000
2023-10-14 22:08:04,113 epoch 5 - iter 2166/3617 - loss 0.03708856 - time (sec): 99.96 - samples/sec: 2255.25 - lr: 0.000018 - momentum: 0.000000
2023-10-14 22:08:20,339 epoch 5 - iter 2527/3617 - loss 0.03713523 - time (sec): 116.19 - samples/sec: 2258.75 - lr: 0.000018 - momentum: 0.000000
2023-10-14 22:08:36,957 epoch 5 - iter 2888/3617 - loss 0.03750961 - time (sec): 132.80 - samples/sec: 2275.54 - lr: 0.000017 - momentum: 0.000000
2023-10-14 22:08:53,212 epoch 5 - iter 3249/3617 - loss 0.03760435 - time (sec): 149.06 - samples/sec: 2285.38 - lr: 0.000017 - momentum: 0.000000
2023-10-14 22:09:09,636 epoch 5 - iter 3610/3617 - loss 0.03853007 - time (sec): 165.48 - samples/sec: 2292.93 - lr: 0.000017 - momentum: 0.000000
2023-10-14 22:09:09,948 ----------------------------------------------------------------------------------------------------
2023-10-14 22:09:09,948 EPOCH 5 done: loss 0.0385 - lr: 0.000017
2023-10-14 22:09:17,131 DEV : loss 0.2994045317173004 - f1-score (micro avg) 0.6358
2023-10-14 22:09:17,161 ----------------------------------------------------------------------------------------------------
2023-10-14 22:09:33,755 epoch 6 - iter 361/3617 - loss 0.02535187 - time (sec): 16.59 - samples/sec: 2169.49 - lr: 0.000016 - momentum: 0.000000
2023-10-14 22:09:50,064 epoch 6 - iter 722/3617 - loss 0.02736771 - time (sec): 32.90 - samples/sec: 2260.33 - lr: 0.000016 - momentum: 0.000000
2023-10-14 22:10:06,390 epoch 6 - iter 1083/3617 - loss 0.02713759 - time (sec): 49.23 - samples/sec: 2260.87 - lr: 0.000016 - momentum: 0.000000
2023-10-14 22:10:22,740 epoch 6 - iter 1444/3617 - loss 0.02699120 - time (sec): 65.58 - samples/sec: 2275.19 - lr: 0.000015 - momentum: 0.000000
2023-10-14 22:10:39,484 epoch 6 - iter 1805/3617 - loss 0.02728068 - time (sec): 82.32 - samples/sec: 2282.65 - lr: 0.000015 - momentum: 0.000000
2023-10-14 22:10:57,852 epoch 6 - iter 2166/3617 - loss 0.02913853 - time (sec): 100.69 - samples/sec: 2246.59 - lr: 0.000015 - momentum: 0.000000
2023-10-14 22:11:17,253 epoch 6 - iter 2527/3617 - loss 0.02836692 - time (sec): 120.09 - samples/sec: 2209.10 - lr: 0.000014 - momentum: 0.000000
2023-10-14 22:11:36,100 epoch 6 - iter 2888/3617 - loss 0.02781183 - time (sec): 138.94 - samples/sec: 2194.97 - lr: 0.000014 - momentum: 0.000000
2023-10-14 22:11:54,931 epoch 6 - iter 3249/3617 - loss 0.02758622 - time (sec): 157.77 - samples/sec: 2164.32 - lr: 0.000014 - momentum: 0.000000
2023-10-14 22:12:13,902 epoch 6 - iter 3610/3617 - loss 0.02692781 - time (sec): 176.74 - samples/sec: 2145.77 - lr: 0.000013 - momentum: 0.000000
2023-10-14 22:12:14,277 ----------------------------------------------------------------------------------------------------
2023-10-14 22:12:14,277 EPOCH 6 done: loss 0.0269 - lr: 0.000013
2023-10-14 22:12:19,958 DEV : loss 0.337155818939209 - f1-score (micro avg) 0.6257
2023-10-14 22:12:20,003 ----------------------------------------------------------------------------------------------------
2023-10-14 22:12:37,007 epoch 7 - iter 361/3617 - loss 0.01474905 - time (sec): 17.00 - samples/sec: 2190.69 - lr: 0.000013 - momentum: 0.000000
2023-10-14 22:12:53,626 epoch 7 - iter 722/3617 - loss 0.01455228 - time (sec): 33.62 - samples/sec: 2278.91 - lr: 0.000013 - momentum: 0.000000
2023-10-14 22:13:09,984 epoch 7 - iter 1083/3617 - loss 0.01703615 - time (sec): 49.98 - samples/sec: 2320.82 - lr: 0.000012 - momentum: 0.000000
2023-10-14 22:13:26,219 epoch 7 - iter 1444/3617 - loss 0.01832267 - time (sec): 66.21 - samples/sec: 2305.64 - lr: 0.000012 - momentum: 0.000000
2023-10-14 22:13:42,728 epoch 7 - iter 1805/3617 - loss 0.02031330 - time (sec): 82.72 - samples/sec: 2302.63 - lr: 0.000012 - momentum: 0.000000
2023-10-14 22:13:58,984 epoch 7 - iter 2166/3617 - loss 0.02071575 - time (sec): 98.98 - samples/sec: 2307.64 - lr: 0.000011 - momentum: 0.000000
2023-10-14 22:14:15,313 epoch 7 - iter 2527/3617 - loss 0.02056007 - time (sec): 115.31 - samples/sec: 2319.44 - lr: 0.000011 - momentum: 0.000000
2023-10-14 22:14:31,743 epoch 7 - iter 2888/3617 - loss 0.01989770 - time (sec): 131.74 - samples/sec: 2301.33 - lr: 0.000011 - momentum: 0.000000
2023-10-14 22:14:48,890 epoch 7 - iter 3249/3617 - loss 0.01977656 - time (sec): 148.88 - samples/sec: 2293.85 - lr: 0.000010 - momentum: 0.000000
2023-10-14 22:15:06,207 epoch 7 - iter 3610/3617 - loss 0.01987886 - time (sec): 166.20 - samples/sec: 2281.30 - lr: 0.000010 - momentum: 0.000000
2023-10-14 22:15:06,529 ----------------------------------------------------------------------------------------------------
2023-10-14 22:15:06,530 EPOCH 7 done: loss 0.0198 - lr: 0.000010
2023-10-14 22:15:13,093 DEV : loss 0.3545047342777252 - f1-score (micro avg) 0.6418
2023-10-14 22:15:13,126 ----------------------------------------------------------------------------------------------------
2023-10-14 22:15:30,580 epoch 8 - iter 361/3617 - loss 0.01017045 - time (sec): 17.45 - samples/sec: 2112.81 - lr: 0.000010 - momentum: 0.000000
2023-10-14 22:15:47,501 epoch 8 - iter 722/3617 - loss 0.01126589 - time (sec): 34.37 - samples/sec: 2200.48 - lr: 0.000009 - momentum: 0.000000
2023-10-14 22:16:04,038 epoch 8 - iter 1083/3617 - loss 0.01307129 - time (sec): 50.91 - samples/sec: 2233.78 - lr: 0.000009 - momentum: 0.000000
2023-10-14 22:16:20,600 epoch 8 - iter 1444/3617 - loss 0.01282998 - time (sec): 67.47 - samples/sec: 2244.92 - lr: 0.000009 - momentum: 0.000000
2023-10-14 22:16:37,101 epoch 8 - iter 1805/3617 - loss 0.01232805 - time (sec): 83.97 - samples/sec: 2244.87 - lr: 0.000008 - momentum: 0.000000
2023-10-14 22:16:53,501 epoch 8 - iter 2166/3617 - loss 0.01310305 - time (sec): 100.37 - samples/sec: 2262.49 - lr: 0.000008 - momentum: 0.000000
2023-10-14 22:17:09,729 epoch 8 - iter 2527/3617 - loss 0.01331872 - time (sec): 116.60 - samples/sec: 2279.94 - lr: 0.000008 - momentum: 0.000000
2023-10-14 22:17:25,984 epoch 8 - iter 2888/3617 - loss 0.01411043 - time (sec): 132.86 - samples/sec: 2281.65 - lr: 0.000007 - momentum: 0.000000
2023-10-14 22:17:41,747 epoch 8 - iter 3249/3617 - loss 0.01398398 - time (sec): 148.62 - samples/sec: 2291.33 - lr: 0.000007 - momentum: 0.000000
2023-10-14 22:17:57,730 epoch 8 - iter 3610/3617 - loss 0.01442377 - time (sec): 164.60 - samples/sec: 2305.18 - lr: 0.000007 - momentum: 0.000000
2023-10-14 22:17:58,039 ----------------------------------------------------------------------------------------------------
2023-10-14 22:17:58,039 EPOCH 8 done: loss 0.0144 - lr: 0.000007
2023-10-14 22:18:04,449 DEV : loss 0.367567777633667 - f1-score (micro avg) 0.653
2023-10-14 22:18:04,480 saving best model
2023-10-14 22:18:04,985 ----------------------------------------------------------------------------------------------------
2023-10-14 22:18:21,415 epoch 9 - iter 361/3617 - loss 0.00607067 - time (sec): 16.43 - samples/sec: 2327.48 - lr: 0.000006 - momentum: 0.000000
2023-10-14 22:18:37,591 epoch 9 - iter 722/3617 - loss 0.00553435 - time (sec): 32.60 - samples/sec: 2324.37 - lr: 0.000006 - momentum: 0.000000
2023-10-14 22:18:53,803 epoch 9 - iter 1083/3617 - loss 0.00695481 - time (sec): 48.81 - samples/sec: 2326.73 - lr: 0.000006 - momentum: 0.000000
2023-10-14 22:19:10,128 epoch 9 - iter 1444/3617 - loss 0.00740150 - time (sec): 65.14 - samples/sec: 2339.99 - lr: 0.000005 - momentum: 0.000000
2023-10-14 22:19:26,337 epoch 9 - iter 1805/3617 - loss 0.00819984 - time (sec): 81.35 - samples/sec: 2342.15 - lr: 0.000005 - momentum: 0.000000
2023-10-14 22:19:42,694 epoch 9 - iter 2166/3617 - loss 0.00848387 - time (sec): 97.70 - samples/sec: 2352.83 - lr: 0.000005 - momentum: 0.000000
2023-10-14 22:19:58,978 epoch 9 - iter 2527/3617 - loss 0.00854654 - time (sec): 113.99 - samples/sec: 2347.77 - lr: 0.000004 - momentum: 0.000000
2023-10-14 22:20:15,184 epoch 9 - iter 2888/3617 - loss 0.00817439 - time (sec): 130.19 - samples/sec: 2338.55 - lr: 0.000004 - momentum: 0.000000
2023-10-14 22:20:31,446 epoch 9 - iter 3249/3617 - loss 0.00797782 - time (sec): 146.46 - samples/sec: 2339.00 - lr: 0.000004 - momentum: 0.000000
2023-10-14 22:20:47,549 epoch 9 - iter 3610/3617 - loss 0.00806657 - time (sec): 162.56 - samples/sec: 2333.16 - lr: 0.000003 - momentum: 0.000000
2023-10-14 22:20:47,856 ----------------------------------------------------------------------------------------------------
2023-10-14 22:20:47,856 EPOCH 9 done: loss 0.0081 - lr: 0.000003
2023-10-14 22:20:54,174 DEV : loss 0.39251258969306946 - f1-score (micro avg) 0.6402
2023-10-14 22:20:54,203 ----------------------------------------------------------------------------------------------------
2023-10-14 22:21:10,540 epoch 10 - iter 361/3617 - loss 0.00300548 - time (sec): 16.34 - samples/sec: 2337.28 - lr: 0.000003 - momentum: 0.000000
2023-10-14 22:21:26,713 epoch 10 - iter 722/3617 - loss 0.00462564 - time (sec): 32.51 - samples/sec: 2349.14 - lr: 0.000003 - momentum: 0.000000
2023-10-14 22:21:42,957 epoch 10 - iter 1083/3617 - loss 0.00462801 - time (sec): 48.75 - samples/sec: 2330.43 - lr: 0.000002 - momentum: 0.000000
2023-10-14 22:21:59,261 epoch 10 - iter 1444/3617 - loss 0.00469015 - time (sec): 65.06 - samples/sec: 2336.89 - lr: 0.000002 - momentum: 0.000000
2023-10-14 22:22:15,731 epoch 10 - iter 1805/3617 - loss 0.00462676 - time (sec): 81.53 - samples/sec: 2334.17 - lr: 0.000002 - momentum: 0.000000
2023-10-14 22:22:32,003 epoch 10 - iter 2166/3617 - loss 0.00461601 - time (sec): 97.80 - samples/sec: 2329.77 - lr: 0.000001 - momentum: 0.000000
2023-10-14 22:22:48,206 epoch 10 - iter 2527/3617 - loss 0.00485498 - time (sec): 114.00 - samples/sec: 2329.35 - lr: 0.000001 - momentum: 0.000000
2023-10-14 22:23:04,501 epoch 10 - iter 2888/3617 - loss 0.00441201 - time (sec): 130.30 - samples/sec: 2334.21 - lr: 0.000001 - momentum: 0.000000
2023-10-14 22:23:20,612 epoch 10 - iter 3249/3617 - loss 0.00517285 - time (sec): 146.41 - samples/sec: 2321.40 - lr: 0.000000 - momentum: 0.000000
2023-10-14 22:23:36,898 epoch 10 - iter 3610/3617 - loss 0.00509801 - time (sec): 162.69 - samples/sec: 2332.40 - lr: 0.000000 - momentum: 0.000000
2023-10-14 22:23:37,189 ----------------------------------------------------------------------------------------------------
2023-10-14 22:23:37,189 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-14 22:23:42,818 DEV : loss 0.400580495595932 - f1-score (micro avg) 0.6444
2023-10-14 22:23:43,200 ----------------------------------------------------------------------------------------------------
2023-10-14 22:23:43,201 Loading model from best epoch ...
2023-10-14 22:23:45,437 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 22:23:52,349
Results:
- F-score (micro) 0.6435
- F-score (macro) 0.5103
- Accuracy 0.4894
By class:
              precision    recall  f1-score   support

         loc     0.6322    0.7851    0.7004       591
        pers     0.5625    0.7311    0.6358       357
         org     0.2000    0.1899    0.1948        79

   micro avg     0.5813    0.7205    0.6435      1027
   macro avg     0.4649    0.5687    0.5103      1027
weighted avg     0.5747    0.7205    0.6390      1027
2023-10-14 22:23:52,349 ----------------------------------------------------------------------------------------------------
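To use the saved best-model.pt for inference, a minimal Flair sketch might look like the following; the checkpoint path and the example sentence are illustrative only, so adjust the path to wherever the model was stored or published.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the best checkpoint from the training base path shown above (illustrative path).
tagger = SequenceTagger.load(
    "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

sentence = Sentence("M. Dupont est arrivé à Genève hier soir .")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity)  # spans labelled loc / pers / org, decoded from the BIOES tags listed above
```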