2023-10-14 21:54:59,595 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Train: 14465 sentences
2023-10-14 21:54:59,596 (train_with_dev=False, train_with_test=False)
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Training Params:
2023-10-14 21:54:59,596 - learning_rate: "3e-05"
2023-10-14 21:54:59,596 - mini_batch_size: "4"
2023-10-14 21:54:59,596 - max_epochs: "10"
2023-10-14 21:54:59,596 - shuffle: "True"
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Plugins:
2023-10-14 21:54:59,596 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 21:54:59,596 - metric: "('micro avg', 'f1-score')"
2023-10-14 21:54:59,596 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,596 Computation:
2023-10-14 21:54:59,596 - compute on device: cuda:0
2023-10-14 21:54:59,597 - embedding storage: none
2023-10-14 21:54:59,597 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,597 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 21:54:59,597 ----------------------------------------------------------------------------------------------------
2023-10-14 21:54:59,597 ----------------------------------------------------------------------------------------------------
2023-10-14 21:55:18,748 epoch 1 - iter 361/3617 - loss 1.37470380 - time (sec): 19.15 - samples/sec:
2000.76 - lr: 0.000003 - momentum: 0.000000 2023-10-14 21:55:36,278 epoch 1 - iter 722/3617 - loss 0.80586607 - time (sec): 36.68 - samples/sec: 2074.90 - lr: 0.000006 - momentum: 0.000000 2023-10-14 21:55:52,539 epoch 1 - iter 1083/3617 - loss 0.59505650 - time (sec): 52.94 - samples/sec: 2137.86 - lr: 0.000009 - momentum: 0.000000 2023-10-14 21:56:08,972 epoch 1 - iter 1444/3617 - loss 0.47633157 - time (sec): 69.37 - samples/sec: 2200.13 - lr: 0.000012 - momentum: 0.000000 2023-10-14 21:56:25,348 epoch 1 - iter 1805/3617 - loss 0.40695084 - time (sec): 85.75 - samples/sec: 2221.40 - lr: 0.000015 - momentum: 0.000000 2023-10-14 21:56:41,760 epoch 1 - iter 2166/3617 - loss 0.35885887 - time (sec): 102.16 - samples/sec: 2250.04 - lr: 0.000018 - momentum: 0.000000 2023-10-14 21:56:57,915 epoch 1 - iter 2527/3617 - loss 0.32613123 - time (sec): 118.32 - samples/sec: 2256.83 - lr: 0.000021 - momentum: 0.000000 2023-10-14 21:57:14,256 epoch 1 - iter 2888/3617 - loss 0.29928065 - time (sec): 134.66 - samples/sec: 2261.76 - lr: 0.000024 - momentum: 0.000000 2023-10-14 21:57:30,968 epoch 1 - iter 3249/3617 - loss 0.27794024 - time (sec): 151.37 - samples/sec: 2260.71 - lr: 0.000027 - momentum: 0.000000 2023-10-14 21:57:47,160 epoch 1 - iter 3610/3617 - loss 0.26205461 - time (sec): 167.56 - samples/sec: 2263.05 - lr: 0.000030 - momentum: 0.000000 2023-10-14 21:57:47,478 ---------------------------------------------------------------------------------------------------- 2023-10-14 21:57:47,478 EPOCH 1 done: loss 0.2617 - lr: 0.000030 2023-10-14 21:57:52,218 DEV : loss 0.12341408431529999 - f1-score (micro avg) 0.5943 2023-10-14 21:57:52,255 saving best model 2023-10-14 21:57:52,708 ---------------------------------------------------------------------------------------------------- 2023-10-14 21:58:10,216 epoch 2 - iter 361/3617 - loss 0.10403458 - time (sec): 17.51 - samples/sec: 2122.79 - lr: 0.000030 - momentum: 0.000000 2023-10-14 21:58:26,642 epoch 2 - iter 722/3617 - 
loss 0.10235270 - time (sec): 33.93 - samples/sec: 2224.39 - lr: 0.000029 - momentum: 0.000000 2023-10-14 21:58:43,240 epoch 2 - iter 1083/3617 - loss 0.10201030 - time (sec): 50.53 - samples/sec: 2254.16 - lr: 0.000029 - momentum: 0.000000 2023-10-14 21:58:59,738 epoch 2 - iter 1444/3617 - loss 0.10128214 - time (sec): 67.03 - samples/sec: 2281.33 - lr: 0.000029 - momentum: 0.000000 2023-10-14 21:59:15,971 epoch 2 - iter 1805/3617 - loss 0.09923902 - time (sec): 83.26 - samples/sec: 2304.80 - lr: 0.000028 - momentum: 0.000000 2023-10-14 21:59:32,225 epoch 2 - iter 2166/3617 - loss 0.09995662 - time (sec): 99.51 - samples/sec: 2306.90 - lr: 0.000028 - momentum: 0.000000 2023-10-14 21:59:48,412 epoch 2 - iter 2527/3617 - loss 0.10116313 - time (sec): 115.70 - samples/sec: 2306.41 - lr: 0.000028 - momentum: 0.000000 2023-10-14 22:00:04,589 epoch 2 - iter 2888/3617 - loss 0.10129391 - time (sec): 131.88 - samples/sec: 2296.43 - lr: 0.000027 - momentum: 0.000000 2023-10-14 22:00:21,042 epoch 2 - iter 3249/3617 - loss 0.09991245 - time (sec): 148.33 - samples/sec: 2303.58 - lr: 0.000027 - momentum: 0.000000 2023-10-14 22:00:37,242 epoch 2 - iter 3610/3617 - loss 0.09941004 - time (sec): 164.53 - samples/sec: 2305.17 - lr: 0.000027 - momentum: 0.000000 2023-10-14 22:00:37,554 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:00:37,554 EPOCH 2 done: loss 0.0993 - lr: 0.000027 2023-10-14 22:00:43,848 DEV : loss 0.1348618119955063 - f1-score (micro avg) 0.6494 2023-10-14 22:00:43,876 saving best model 2023-10-14 22:00:44,365 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:01:00,728 epoch 3 - iter 361/3617 - loss 0.06498786 - time (sec): 16.36 - samples/sec: 2373.91 - lr: 0.000026 - momentum: 0.000000 2023-10-14 22:01:16,889 epoch 3 - iter 722/3617 - loss 0.07260502 - time (sec): 32.52 - samples/sec: 2355.96 - lr: 0.000026 - momentum: 0.000000 
2023-10-14 22:01:33,116 epoch 3 - iter 1083/3617 - loss 0.07621223 - time (sec): 48.75 - samples/sec: 2337.97 - lr: 0.000026 - momentum: 0.000000 2023-10-14 22:01:49,355 epoch 3 - iter 1444/3617 - loss 0.07608788 - time (sec): 64.98 - samples/sec: 2351.04 - lr: 0.000025 - momentum: 0.000000 2023-10-14 22:02:05,843 epoch 3 - iter 1805/3617 - loss 0.07598984 - time (sec): 81.47 - samples/sec: 2333.63 - lr: 0.000025 - momentum: 0.000000 2023-10-14 22:02:21,958 epoch 3 - iter 2166/3617 - loss 0.07775218 - time (sec): 97.59 - samples/sec: 2335.11 - lr: 0.000025 - momentum: 0.000000 2023-10-14 22:02:38,247 epoch 3 - iter 2527/3617 - loss 0.07602466 - time (sec): 113.88 - samples/sec: 2332.17 - lr: 0.000024 - momentum: 0.000000 2023-10-14 22:02:54,496 epoch 3 - iter 2888/3617 - loss 0.07632851 - time (sec): 130.13 - samples/sec: 2327.74 - lr: 0.000024 - momentum: 0.000000 2023-10-14 22:03:10,904 epoch 3 - iter 3249/3617 - loss 0.07723473 - time (sec): 146.53 - samples/sec: 2329.75 - lr: 0.000024 - momentum: 0.000000 2023-10-14 22:03:27,104 epoch 3 - iter 3610/3617 - loss 0.07511861 - time (sec): 162.73 - samples/sec: 2330.38 - lr: 0.000023 - momentum: 0.000000 2023-10-14 22:03:27,411 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:03:27,411 EPOCH 3 done: loss 0.0750 - lr: 0.000023 2023-10-14 22:03:33,762 DEV : loss 0.20539723336696625 - f1-score (micro avg) 0.6436 2023-10-14 22:03:33,797 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:03:50,426 epoch 4 - iter 361/3617 - loss 0.04544183 - time (sec): 16.63 - samples/sec: 2346.47 - lr: 0.000023 - momentum: 0.000000 2023-10-14 22:04:06,629 epoch 4 - iter 722/3617 - loss 0.04633691 - time (sec): 32.83 - samples/sec: 2331.87 - lr: 0.000023 - momentum: 0.000000 2023-10-14 22:04:22,959 epoch 4 - iter 1083/3617 - loss 0.04843502 - time (sec): 49.16 - samples/sec: 2334.59 - lr: 0.000022 - momentum: 
0.000000 2023-10-14 22:04:39,149 epoch 4 - iter 1444/3617 - loss 0.05354134 - time (sec): 65.35 - samples/sec: 2319.23 - lr: 0.000022 - momentum: 0.000000 2023-10-14 22:04:55,305 epoch 4 - iter 1805/3617 - loss 0.05419777 - time (sec): 81.51 - samples/sec: 2319.80 - lr: 0.000022 - momentum: 0.000000 2023-10-14 22:05:11,659 epoch 4 - iter 2166/3617 - loss 0.05236216 - time (sec): 97.86 - samples/sec: 2323.42 - lr: 0.000021 - momentum: 0.000000 2023-10-14 22:05:27,796 epoch 4 - iter 2527/3617 - loss 0.05247379 - time (sec): 114.00 - samples/sec: 2328.95 - lr: 0.000021 - momentum: 0.000000 2023-10-14 22:05:44,068 epoch 4 - iter 2888/3617 - loss 0.05210563 - time (sec): 130.27 - samples/sec: 2335.22 - lr: 0.000021 - momentum: 0.000000 2023-10-14 22:06:00,162 epoch 4 - iter 3249/3617 - loss 0.05267914 - time (sec): 146.36 - samples/sec: 2334.44 - lr: 0.000020 - momentum: 0.000000 2023-10-14 22:06:16,279 epoch 4 - iter 3610/3617 - loss 0.05369010 - time (sec): 162.48 - samples/sec: 2333.65 - lr: 0.000020 - momentum: 0.000000 2023-10-14 22:06:16,593 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:06:16,593 EPOCH 4 done: loss 0.0537 - lr: 0.000020 2023-10-14 22:06:24,103 DEV : loss 0.22502246499061584 - f1-score (micro avg) 0.6245 2023-10-14 22:06:24,150 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:06:42,193 epoch 5 - iter 361/3617 - loss 0.03757363 - time (sec): 18.04 - samples/sec: 2095.78 - lr: 0.000020 - momentum: 0.000000 2023-10-14 22:06:58,597 epoch 5 - iter 722/3617 - loss 0.03566151 - time (sec): 34.44 - samples/sec: 2204.45 - lr: 0.000019 - momentum: 0.000000 2023-10-14 22:07:14,945 epoch 5 - iter 1083/3617 - loss 0.03610137 - time (sec): 50.79 - samples/sec: 2237.25 - lr: 0.000019 - momentum: 0.000000 2023-10-14 22:07:31,411 epoch 5 - iter 1444/3617 - loss 0.03673278 - time (sec): 67.26 - samples/sec: 2238.77 - lr: 0.000019 - 
momentum: 0.000000 2023-10-14 22:07:47,752 epoch 5 - iter 1805/3617 - loss 0.03672910 - time (sec): 83.60 - samples/sec: 2251.84 - lr: 0.000018 - momentum: 0.000000 2023-10-14 22:08:04,113 epoch 5 - iter 2166/3617 - loss 0.03708856 - time (sec): 99.96 - samples/sec: 2255.25 - lr: 0.000018 - momentum: 0.000000 2023-10-14 22:08:20,339 epoch 5 - iter 2527/3617 - loss 0.03713523 - time (sec): 116.19 - samples/sec: 2258.75 - lr: 0.000018 - momentum: 0.000000 2023-10-14 22:08:36,957 epoch 5 - iter 2888/3617 - loss 0.03750961 - time (sec): 132.80 - samples/sec: 2275.54 - lr: 0.000017 - momentum: 0.000000 2023-10-14 22:08:53,212 epoch 5 - iter 3249/3617 - loss 0.03760435 - time (sec): 149.06 - samples/sec: 2285.38 - lr: 0.000017 - momentum: 0.000000 2023-10-14 22:09:09,636 epoch 5 - iter 3610/3617 - loss 0.03853007 - time (sec): 165.48 - samples/sec: 2292.93 - lr: 0.000017 - momentum: 0.000000 2023-10-14 22:09:09,948 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:09:09,948 EPOCH 5 done: loss 0.0385 - lr: 0.000017 2023-10-14 22:09:17,131 DEV : loss 0.2994045317173004 - f1-score (micro avg) 0.6358 2023-10-14 22:09:17,161 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:09:33,755 epoch 6 - iter 361/3617 - loss 0.02535187 - time (sec): 16.59 - samples/sec: 2169.49 - lr: 0.000016 - momentum: 0.000000 2023-10-14 22:09:50,064 epoch 6 - iter 722/3617 - loss 0.02736771 - time (sec): 32.90 - samples/sec: 2260.33 - lr: 0.000016 - momentum: 0.000000 2023-10-14 22:10:06,390 epoch 6 - iter 1083/3617 - loss 0.02713759 - time (sec): 49.23 - samples/sec: 2260.87 - lr: 0.000016 - momentum: 0.000000 2023-10-14 22:10:22,740 epoch 6 - iter 1444/3617 - loss 0.02699120 - time (sec): 65.58 - samples/sec: 2275.19 - lr: 0.000015 - momentum: 0.000000 2023-10-14 22:10:39,484 epoch 6 - iter 1805/3617 - loss 0.02728068 - time (sec): 82.32 - samples/sec: 2282.65 - lr: 
0.000015 - momentum: 0.000000 2023-10-14 22:10:57,852 epoch 6 - iter 2166/3617 - loss 0.02913853 - time (sec): 100.69 - samples/sec: 2246.59 - lr: 0.000015 - momentum: 0.000000 2023-10-14 22:11:17,253 epoch 6 - iter 2527/3617 - loss 0.02836692 - time (sec): 120.09 - samples/sec: 2209.10 - lr: 0.000014 - momentum: 0.000000 2023-10-14 22:11:36,100 epoch 6 - iter 2888/3617 - loss 0.02781183 - time (sec): 138.94 - samples/sec: 2194.97 - lr: 0.000014 - momentum: 0.000000 2023-10-14 22:11:54,931 epoch 6 - iter 3249/3617 - loss 0.02758622 - time (sec): 157.77 - samples/sec: 2164.32 - lr: 0.000014 - momentum: 0.000000 2023-10-14 22:12:13,902 epoch 6 - iter 3610/3617 - loss 0.02692781 - time (sec): 176.74 - samples/sec: 2145.77 - lr: 0.000013 - momentum: 0.000000 2023-10-14 22:12:14,277 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:12:14,277 EPOCH 6 done: loss 0.0269 - lr: 0.000013 2023-10-14 22:12:19,958 DEV : loss 0.337155818939209 - f1-score (micro avg) 0.6257 2023-10-14 22:12:20,003 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:12:37,007 epoch 7 - iter 361/3617 - loss 0.01474905 - time (sec): 17.00 - samples/sec: 2190.69 - lr: 0.000013 - momentum: 0.000000 2023-10-14 22:12:53,626 epoch 7 - iter 722/3617 - loss 0.01455228 - time (sec): 33.62 - samples/sec: 2278.91 - lr: 0.000013 - momentum: 0.000000 2023-10-14 22:13:09,984 epoch 7 - iter 1083/3617 - loss 0.01703615 - time (sec): 49.98 - samples/sec: 2320.82 - lr: 0.000012 - momentum: 0.000000 2023-10-14 22:13:26,219 epoch 7 - iter 1444/3617 - loss 0.01832267 - time (sec): 66.21 - samples/sec: 2305.64 - lr: 0.000012 - momentum: 0.000000 2023-10-14 22:13:42,728 epoch 7 - iter 1805/3617 - loss 0.02031330 - time (sec): 82.72 - samples/sec: 2302.63 - lr: 0.000012 - momentum: 0.000000 2023-10-14 22:13:58,984 epoch 7 - iter 2166/3617 - loss 0.02071575 - time (sec): 98.98 - samples/sec: 
2307.64 - lr: 0.000011 - momentum: 0.000000 2023-10-14 22:14:15,313 epoch 7 - iter 2527/3617 - loss 0.02056007 - time (sec): 115.31 - samples/sec: 2319.44 - lr: 0.000011 - momentum: 0.000000 2023-10-14 22:14:31,743 epoch 7 - iter 2888/3617 - loss 0.01989770 - time (sec): 131.74 - samples/sec: 2301.33 - lr: 0.000011 - momentum: 0.000000 2023-10-14 22:14:48,890 epoch 7 - iter 3249/3617 - loss 0.01977656 - time (sec): 148.88 - samples/sec: 2293.85 - lr: 0.000010 - momentum: 0.000000 2023-10-14 22:15:06,207 epoch 7 - iter 3610/3617 - loss 0.01987886 - time (sec): 166.20 - samples/sec: 2281.30 - lr: 0.000010 - momentum: 0.000000 2023-10-14 22:15:06,529 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:15:06,530 EPOCH 7 done: loss 0.0198 - lr: 0.000010 2023-10-14 22:15:13,093 DEV : loss 0.3545047342777252 - f1-score (micro avg) 0.6418 2023-10-14 22:15:13,126 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:15:30,580 epoch 8 - iter 361/3617 - loss 0.01017045 - time (sec): 17.45 - samples/sec: 2112.81 - lr: 0.000010 - momentum: 0.000000 2023-10-14 22:15:47,501 epoch 8 - iter 722/3617 - loss 0.01126589 - time (sec): 34.37 - samples/sec: 2200.48 - lr: 0.000009 - momentum: 0.000000 2023-10-14 22:16:04,038 epoch 8 - iter 1083/3617 - loss 0.01307129 - time (sec): 50.91 - samples/sec: 2233.78 - lr: 0.000009 - momentum: 0.000000 2023-10-14 22:16:20,600 epoch 8 - iter 1444/3617 - loss 0.01282998 - time (sec): 67.47 - samples/sec: 2244.92 - lr: 0.000009 - momentum: 0.000000 2023-10-14 22:16:37,101 epoch 8 - iter 1805/3617 - loss 0.01232805 - time (sec): 83.97 - samples/sec: 2244.87 - lr: 0.000008 - momentum: 0.000000 2023-10-14 22:16:53,501 epoch 8 - iter 2166/3617 - loss 0.01310305 - time (sec): 100.37 - samples/sec: 2262.49 - lr: 0.000008 - momentum: 0.000000 2023-10-14 22:17:09,729 epoch 8 - iter 2527/3617 - loss 0.01331872 - time (sec): 116.60 - 
samples/sec: 2279.94 - lr: 0.000008 - momentum: 0.000000 2023-10-14 22:17:25,984 epoch 8 - iter 2888/3617 - loss 0.01411043 - time (sec): 132.86 - samples/sec: 2281.65 - lr: 0.000007 - momentum: 0.000000 2023-10-14 22:17:41,747 epoch 8 - iter 3249/3617 - loss 0.01398398 - time (sec): 148.62 - samples/sec: 2291.33 - lr: 0.000007 - momentum: 0.000000 2023-10-14 22:17:57,730 epoch 8 - iter 3610/3617 - loss 0.01442377 - time (sec): 164.60 - samples/sec: 2305.18 - lr: 0.000007 - momentum: 0.000000 2023-10-14 22:17:58,039 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:17:58,039 EPOCH 8 done: loss 0.0144 - lr: 0.000007 2023-10-14 22:18:04,449 DEV : loss 0.367567777633667 - f1-score (micro avg) 0.653 2023-10-14 22:18:04,480 saving best model 2023-10-14 22:18:04,985 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:18:21,415 epoch 9 - iter 361/3617 - loss 0.00607067 - time (sec): 16.43 - samples/sec: 2327.48 - lr: 0.000006 - momentum: 0.000000 2023-10-14 22:18:37,591 epoch 9 - iter 722/3617 - loss 0.00553435 - time (sec): 32.60 - samples/sec: 2324.37 - lr: 0.000006 - momentum: 0.000000 2023-10-14 22:18:53,803 epoch 9 - iter 1083/3617 - loss 0.00695481 - time (sec): 48.81 - samples/sec: 2326.73 - lr: 0.000006 - momentum: 0.000000 2023-10-14 22:19:10,128 epoch 9 - iter 1444/3617 - loss 0.00740150 - time (sec): 65.14 - samples/sec: 2339.99 - lr: 0.000005 - momentum: 0.000000 2023-10-14 22:19:26,337 epoch 9 - iter 1805/3617 - loss 0.00819984 - time (sec): 81.35 - samples/sec: 2342.15 - lr: 0.000005 - momentum: 0.000000 2023-10-14 22:19:42,694 epoch 9 - iter 2166/3617 - loss 0.00848387 - time (sec): 97.70 - samples/sec: 2352.83 - lr: 0.000005 - momentum: 0.000000 2023-10-14 22:19:58,978 epoch 9 - iter 2527/3617 - loss 0.00854654 - time (sec): 113.99 - samples/sec: 2347.77 - lr: 0.000004 - momentum: 0.000000 2023-10-14 22:20:15,184 epoch 9 - iter 
2888/3617 - loss 0.00817439 - time (sec): 130.19 - samples/sec: 2338.55 - lr: 0.000004 - momentum: 0.000000 2023-10-14 22:20:31,446 epoch 9 - iter 3249/3617 - loss 0.00797782 - time (sec): 146.46 - samples/sec: 2339.00 - lr: 0.000004 - momentum: 0.000000 2023-10-14 22:20:47,549 epoch 9 - iter 3610/3617 - loss 0.00806657 - time (sec): 162.56 - samples/sec: 2333.16 - lr: 0.000003 - momentum: 0.000000 2023-10-14 22:20:47,856 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:20:47,856 EPOCH 9 done: loss 0.0081 - lr: 0.000003 2023-10-14 22:20:54,174 DEV : loss 0.39251258969306946 - f1-score (micro avg) 0.6402 2023-10-14 22:20:54,203 ---------------------------------------------------------------------------------------------------- 2023-10-14 22:21:10,540 epoch 10 - iter 361/3617 - loss 0.00300548 - time (sec): 16.34 - samples/sec: 2337.28 - lr: 0.000003 - momentum: 0.000000 2023-10-14 22:21:26,713 epoch 10 - iter 722/3617 - loss 0.00462564 - time (sec): 32.51 - samples/sec: 2349.14 - lr: 0.000003 - momentum: 0.000000 2023-10-14 22:21:42,957 epoch 10 - iter 1083/3617 - loss 0.00462801 - time (sec): 48.75 - samples/sec: 2330.43 - lr: 0.000002 - momentum: 0.000000 2023-10-14 22:21:59,261 epoch 10 - iter 1444/3617 - loss 0.00469015 - time (sec): 65.06 - samples/sec: 2336.89 - lr: 0.000002 - momentum: 0.000000 2023-10-14 22:22:15,731 epoch 10 - iter 1805/3617 - loss 0.00462676 - time (sec): 81.53 - samples/sec: 2334.17 - lr: 0.000002 - momentum: 0.000000 2023-10-14 22:22:32,003 epoch 10 - iter 2166/3617 - loss 0.00461601 - time (sec): 97.80 - samples/sec: 2329.77 - lr: 0.000001 - momentum: 0.000000 2023-10-14 22:22:48,206 epoch 10 - iter 2527/3617 - loss 0.00485498 - time (sec): 114.00 - samples/sec: 2329.35 - lr: 0.000001 - momentum: 0.000000 2023-10-14 22:23:04,501 epoch 10 - iter 2888/3617 - loss 0.00441201 - time (sec): 130.30 - samples/sec: 2334.21 - lr: 0.000001 - momentum: 0.000000 2023-10-14 
22:23:20,612 epoch 10 - iter 3249/3617 - loss 0.00517285 - time (sec): 146.41 - samples/sec: 2321.40 - lr: 0.000000 - momentum: 0.000000
2023-10-14 22:23:36,898 epoch 10 - iter 3610/3617 - loss 0.00509801 - time (sec): 162.69 - samples/sec: 2332.40 - lr: 0.000000 - momentum: 0.000000
2023-10-14 22:23:37,189 ----------------------------------------------------------------------------------------------------
2023-10-14 22:23:37,189 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-14 22:23:42,818 DEV : loss 0.400580495595932 - f1-score (micro avg) 0.6444
2023-10-14 22:23:43,200 ----------------------------------------------------------------------------------------------------
2023-10-14 22:23:43,201 Loading model from best epoch ...
2023-10-14 22:23:45,437 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 22:23:52,349 
Results:
- F-score (micro) 0.6435
- F-score (macro) 0.5103
- Accuracy 0.4894

By class:
              precision    recall  f1-score   support

         loc     0.6322    0.7851    0.7004       591
        pers     0.5625    0.7311    0.6358       357
         org     0.2000    0.1899    0.1948        79

   micro avg     0.5813    0.7205    0.6435      1027
   macro avg     0.4649    0.5687    0.5103      1027
weighted avg     0.5747    0.7205    0.6390      1027

2023-10-14 22:23:52,349 ----------------------------------------------------------------------------------------------------
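
Note on the reported scores: the sketch below is not part of the training run. It is a small sanity check, assuming the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of the per-class F1 values), that recomputes the micro and macro F-scores from the rounded figures in the final results. The helper name `f1` and the variable names are illustrative only.

```python
# Sanity check of the reported F-scores, using the rounded values from the log.
# Assumption: F1 = 2PR / (P + R); macro F1 = unweighted mean of per-class F1.

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# "micro avg" row: precision 0.5813, recall 0.7205
micro_f1 = f1(0.5813, 0.7205)

# Per-class f1-score column: loc, pers, org
class_f1 = {"loc": 0.7004, "pers": 0.6358, "org": 0.1948}
macro_f1 = sum(class_f1.values()) / len(class_f1)

print(f"micro F1 ~ {micro_f1:.4f}")  # log reports 0.6435
print(f"macro F1 ~ {macro_f1:.4f}")  # log reports 0.5103
```

Because the inputs are already rounded to four decimals, small differences in the last digit are possible for other rows (for example the weighted average).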