2023-10-14 20:15:14,574 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,575 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,575 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,575 Train: 14465 sentences
2023-10-14 20:15:14,576 (train_with_dev=False, train_with_test=False)
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Training Params:
2023-10-14 20:15:14,576 - learning_rate: "3e-05"
2023-10-14 20:15:14,576 - mini_batch_size: "4"
2023-10-14 20:15:14,576 - max_epochs: "10"
2023-10-14 20:15:14,576 - shuffle: "True"
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Plugins:
2023-10-14 20:15:14,576 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
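The `LinearScheduler` plugin with `warmup_fraction: 0.1` explains the lr column below: the learning rate ramps from 0 up to the peak of 3e-05 over the first 10% of all mini-batch steps (exactly epoch 1 of 10 here), then decays linearly back to 0. A minimal pure-Python sketch of that schedule, assuming this warmup-then-linear-decay shape (not Flair's actual implementation):

```python
def linear_warmup_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linearly ramp lr to peak over the warmup steps, then decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # linear decay from peak back to zero over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 3617  # 10 epochs x 3617 mini-batches per epoch
print(linear_warmup_lr(3617, total))   # end of epoch 1: peak lr 3e-05
print(linear_warmup_lr(total, total))  # end of training: 0.0
```

With `warmup_fraction=0.1` and 36170 total steps, warmup ends exactly at step 3617, which matches the log reaching `lr: 0.000030` at the last iteration of epoch 1 and decaying toward 0 by epoch 10.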
2023-10-14 20:15:14,576 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 20:15:14,576 - metric: "('micro avg', 'f1-score')"
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Computation:
2023-10-14 20:15:14,576 - compute on device: cuda:0
2023-10-14 20:15:14,576 - embedding storage: none
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:30,762 epoch 1 - iter 361/3617 - loss 1.47398781 - time (sec): 16.18 - samples/sec: 2325.55 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:15:47,116 epoch 1 - iter 722/3617 - loss 0.83970179 - time (sec): 32.54 - samples/sec: 2319.29 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:16:03,219 epoch 1 - iter 1083/3617 - loss 0.62059022 - time (sec): 48.64 - samples/sec: 2290.84 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:16:19,390 epoch 1 - iter 1444/3617 - loss 0.50223969 - time (sec): 64.81 - samples/sec: 2291.76 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:16:35,953 epoch 1 - iter 1805/3617 - loss 0.42436958 - time (sec): 81.38 - samples/sec: 2313.82 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:16:52,015 epoch 1 - iter 2166/3617 - loss 0.37437638 - time (sec): 97.44 - samples/sec: 2319.20 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:17:07,780 epoch 1 - iter 2527/3617 - loss 0.33741262 - time (sec): 113.20 - samples/sec: 2348.50 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:17:23,796 epoch 1 - iter 2888/3617 - loss 0.31042790 - time (sec): 129.22 - samples/sec: 2349.52 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:17:39,981 epoch 1 - iter 3249/3617 - loss 0.28794148 - time (sec): 145.40 - samples/sec: 2345.55 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:17:56,334 epoch 1 - iter 3610/3617 - loss 0.27061320 - time (sec): 161.76 - samples/sec: 2344.14 - lr: 0.000030 - momentum: 0.000000
2023-10-14 20:17:56,632 ----------------------------------------------------------------------------------------------------
2023-10-14 20:17:56,633 EPOCH 1 done: loss 0.2704 - lr: 0.000030
2023-10-14 20:18:02,040 DEV : loss 0.11999412626028061 - f1-score (micro avg) 0.6234
2023-10-14 20:18:02,080 saving best model
2023-10-14 20:18:02,475 ----------------------------------------------------------------------------------------------------
2023-10-14 20:18:21,558 epoch 2 - iter 361/3617 - loss 0.09782984 - time (sec): 19.08 - samples/sec: 2009.32 - lr: 0.000030 - momentum: 0.000000
2023-10-14 20:18:39,215 epoch 2 - iter 722/3617 - loss 0.09421131 - time (sec): 36.74 - samples/sec: 2056.11 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:18:55,988 epoch 2 - iter 1083/3617 - loss 0.09577052 - time (sec): 53.51 - samples/sec: 2106.61 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:19:13,211 epoch 2 - iter 1444/3617 - loss 0.09595965 - time (sec): 70.73 - samples/sec: 2158.23 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:19:29,358 epoch 2 - iter 1805/3617 - loss 0.09590618 - time (sec): 86.88 - samples/sec: 2185.24 - lr: 0.000028 - momentum: 0.000000
2023-10-14 20:19:46,839 epoch 2 - iter 2166/3617 - loss 0.09403509 - time (sec): 104.36 - samples/sec: 2192.52 - lr: 0.000028 - momentum: 0.000000
2023-10-14 20:20:03,705 epoch 2 - iter 2527/3617 - loss 0.09439687 - time (sec): 121.23 - samples/sec: 2211.93 - lr: 0.000028 - momentum: 0.000000
2023-10-14 20:20:20,276 epoch 2 - iter 2888/3617 - loss 0.09538483 - time (sec): 137.80 - samples/sec: 2216.77 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:20:36,508 epoch 2 - iter 3249/3617 - loss 0.09475508 - time (sec): 154.03 - samples/sec: 2215.51 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:20:55,584 epoch 2 - iter 3610/3617 - loss 0.09527647 - time (sec): 173.11 - samples/sec: 2191.11 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:20:55,956 ----------------------------------------------------------------------------------------------------
2023-10-14 20:20:55,956 EPOCH 2 done: loss 0.0953 - lr: 0.000027
2023-10-14 20:21:02,854 DEV : loss 0.12750780582427979 - f1-score (micro avg) 0.6294
2023-10-14 20:21:02,888 saving best model
2023-10-14 20:21:03,598 ----------------------------------------------------------------------------------------------------
2023-10-14 20:21:22,467 epoch 3 - iter 361/3617 - loss 0.05413118 - time (sec): 18.87 - samples/sec: 1959.23 - lr: 0.000026 - momentum: 0.000000
2023-10-14 20:21:41,460 epoch 3 - iter 722/3617 - loss 0.06427074 - time (sec): 37.86 - samples/sec: 1973.69 - lr: 0.000026 - momentum: 0.000000
2023-10-14 20:21:57,897 epoch 3 - iter 1083/3617 - loss 0.07334315 - time (sec): 54.30 - samples/sec: 2076.57 - lr: 0.000026 - momentum: 0.000000
2023-10-14 20:22:14,087 epoch 3 - iter 1444/3617 - loss 0.07359621 - time (sec): 70.49 - samples/sec: 2131.15 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:22:30,386 epoch 3 - iter 1805/3617 - loss 0.07222582 - time (sec): 86.79 - samples/sec: 2165.52 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:22:46,657 epoch 3 - iter 2166/3617 - loss 0.07167594 - time (sec): 103.06 - samples/sec: 2195.87 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:23:03,231 epoch 3 - iter 2527/3617 - loss 0.07168900 - time (sec): 119.63 - samples/sec: 2219.22 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:23:19,692 epoch 3 - iter 2888/3617 - loss 0.07149544 - time (sec): 136.09 - samples/sec: 2230.31 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:23:36,017 epoch 3 - iter 3249/3617 - loss 0.07253118 - time (sec): 152.42 - samples/sec: 2240.40 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:23:52,572 epoch 3 - iter 3610/3617 - loss 0.07273065 - time (sec): 168.97 - samples/sec: 2244.48 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:23:52,880 ----------------------------------------------------------------------------------------------------
2023-10-14 20:23:52,880 EPOCH 3 done: loss 0.0727 - lr: 0.000023
2023-10-14 20:23:59,342 DEV : loss 0.23288682103157043 - f1-score (micro avg) 0.6258
2023-10-14 20:23:59,373 ----------------------------------------------------------------------------------------------------
2023-10-14 20:24:15,633 epoch 4 - iter 361/3617 - loss 0.05027835 - time (sec): 16.26 - samples/sec: 2260.02 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:24:32,140 epoch 4 - iter 722/3617 - loss 0.05104940 - time (sec): 32.77 - samples/sec: 2293.15 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:24:48,509 epoch 4 - iter 1083/3617 - loss 0.04972765 - time (sec): 49.13 - samples/sec: 2297.62 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:25:04,840 epoch 4 - iter 1444/3617 - loss 0.04948816 - time (sec): 65.47 - samples/sec: 2300.32 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:25:21,353 epoch 4 - iter 1805/3617 - loss 0.04959231 - time (sec): 81.98 - samples/sec: 2316.09 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:25:37,665 epoch 4 - iter 2166/3617 - loss 0.05039825 - time (sec): 98.29 - samples/sec: 2325.84 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:25:53,888 epoch 4 - iter 2527/3617 - loss 0.05098071 - time (sec): 114.51 - samples/sec: 2324.83 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:26:09,988 epoch 4 - iter 2888/3617 - loss 0.05275888 - time (sec): 130.61 - samples/sec: 2333.25 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:26:26,056 epoch 4 - iter 3249/3617 - loss 0.05280906 - time (sec): 146.68 - samples/sec: 2333.90 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:26:42,216 epoch 4 - iter 3610/3617 - loss 0.05251226 - time (sec): 162.84 - samples/sec: 2328.47 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:26:42,519 ----------------------------------------------------------------------------------------------------
2023-10-14 20:26:42,519 EPOCH 4 done: loss 0.0524 - lr: 0.000020
2023-10-14 20:26:48,265 DEV : loss 0.29611918330192566 - f1-score (micro avg) 0.6115
2023-10-14 20:26:48,298 ----------------------------------------------------------------------------------------------------
2023-10-14 20:27:05,568 epoch 5 - iter 361/3617 - loss 0.04203166 - time (sec): 17.27 - samples/sec: 2138.47 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:27:21,925 epoch 5 - iter 722/3617 - loss 0.03700812 - time (sec): 33.63 - samples/sec: 2268.87 - lr: 0.000019 - momentum: 0.000000
2023-10-14 20:27:38,355 epoch 5 - iter 1083/3617 - loss 0.03554194 - time (sec): 50.06 - samples/sec: 2282.52 - lr: 0.000019 - momentum: 0.000000
2023-10-14 20:27:54,786 epoch 5 - iter 1444/3617 - loss 0.03570610 - time (sec): 66.49 - samples/sec: 2278.40 - lr: 0.000019 - momentum: 0.000000
2023-10-14 20:28:11,254 epoch 5 - iter 1805/3617 - loss 0.03486095 - time (sec): 82.95 - samples/sec: 2272.79 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:28:27,973 epoch 5 - iter 2166/3617 - loss 0.03550990 - time (sec): 99.67 - samples/sec: 2291.81 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:28:44,336 epoch 5 - iter 2527/3617 - loss 0.03627469 - time (sec): 116.04 - samples/sec: 2296.38 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:29:00,528 epoch 5 - iter 2888/3617 - loss 0.03591614 - time (sec): 132.23 - samples/sec: 2303.16 - lr: 0.000017 - momentum: 0.000000
2023-10-14 20:29:16,672 epoch 5 - iter 3249/3617 - loss 0.03711630 - time (sec): 148.37 - samples/sec: 2304.33 - lr: 0.000017 - momentum: 0.000000
2023-10-14 20:29:32,867 epoch 5 - iter 3610/3617 - loss 0.03676579 - time (sec): 164.57 - samples/sec: 2303.59 - lr: 0.000017 - momentum: 0.000000
2023-10-14 20:29:33,174 ----------------------------------------------------------------------------------------------------
2023-10-14 20:29:33,174 EPOCH 5 done: loss 0.0367 - lr: 0.000017
2023-10-14 20:29:38,931 DEV : loss 0.30095481872558594 - f1-score (micro avg) 0.6255
2023-10-14 20:29:38,964 ----------------------------------------------------------------------------------------------------
2023-10-14 20:29:55,577 epoch 6 - iter 361/3617 - loss 0.02879097 - time (sec): 16.61 - samples/sec: 2327.19 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:30:11,985 epoch 6 - iter 722/3617 - loss 0.02283689 - time (sec): 33.02 - samples/sec: 2300.73 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:30:28,425 epoch 6 - iter 1083/3617 - loss 0.02449134 - time (sec): 49.46 - samples/sec: 2308.91 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:30:44,812 epoch 6 - iter 1444/3617 - loss 0.02459281 - time (sec): 65.85 - samples/sec: 2288.93 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:31:01,172 epoch 6 - iter 1805/3617 - loss 0.02646233 - time (sec): 82.21 - samples/sec: 2285.13 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:31:17,640 epoch 6 - iter 2166/3617 - loss 0.02632674 - time (sec): 98.67 - samples/sec: 2283.10 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:31:34,002 epoch 6 - iter 2527/3617 - loss 0.02565887 - time (sec): 115.04 - samples/sec: 2284.73 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:31:50,346 epoch 6 - iter 2888/3617 - loss 0.02507191 - time (sec): 131.38 - samples/sec: 2294.77 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:32:06,871 epoch 6 - iter 3249/3617 - loss 0.02538127 - time (sec): 147.91 - samples/sec: 2298.40 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:32:23,470 epoch 6 - iter 3610/3617 - loss 0.02548542 - time (sec): 164.51 - samples/sec: 2305.67 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:32:23,772 ----------------------------------------------------------------------------------------------------
2023-10-14 20:32:23,772 EPOCH 6 done: loss 0.0254 - lr: 0.000013
2023-10-14 20:32:31,112 DEV : loss 0.35236480832099915 - f1-score (micro avg) 0.6282
2023-10-14 20:32:31,150 ----------------------------------------------------------------------------------------------------
2023-10-14 20:32:48,700 epoch 7 - iter 361/3617 - loss 0.01714858 - time (sec): 17.55 - samples/sec: 2204.32 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:33:05,924 epoch 7 - iter 722/3617 - loss 0.01781415 - time (sec): 34.77 - samples/sec: 2200.52 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:33:22,908 epoch 7 - iter 1083/3617 - loss 0.01660375 - time (sec): 51.76 - samples/sec: 2206.15 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:33:38,636 epoch 7 - iter 1444/3617 - loss 0.01611126 - time (sec): 67.49 - samples/sec: 2258.38 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:33:55,037 epoch 7 - iter 1805/3617 - loss 0.01766198 - time (sec): 83.89 - samples/sec: 2270.09 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:34:11,293 epoch 7 - iter 2166/3617 - loss 0.01772828 - time (sec): 100.14 - samples/sec: 2272.39 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:34:27,657 epoch 7 - iter 2527/3617 - loss 0.01786773 - time (sec): 116.51 - samples/sec: 2275.25 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:34:44,293 epoch 7 - iter 2888/3617 - loss 0.01874021 - time (sec): 133.14 - samples/sec: 2289.48 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:35:00,684 epoch 7 - iter 3249/3617 - loss 0.01801121 - time (sec): 149.53 - samples/sec: 2286.88 - lr: 0.000010 - momentum: 0.000000
2023-10-14 20:35:17,050 epoch 7 - iter 3610/3617 - loss 0.01800545 - time (sec): 165.90 - samples/sec: 2287.35 - lr: 0.000010 - momentum: 0.000000
2023-10-14 20:35:17,351 ----------------------------------------------------------------------------------------------------
2023-10-14 20:35:17,352 EPOCH 7 done: loss 0.0180 - lr: 0.000010
2023-10-14 20:35:23,896 DEV : loss 0.3972169756889343 - f1-score (micro avg) 0.6267
2023-10-14 20:35:23,934 ----------------------------------------------------------------------------------------------------
2023-10-14 20:35:42,130 epoch 8 - iter 361/3617 - loss 0.00969253 - time (sec): 18.19 - samples/sec: 2088.14 - lr: 0.000010 - momentum: 0.000000
2023-10-14 20:35:58,861 epoch 8 - iter 722/3617 - loss 0.00894942 - time (sec): 34.93 - samples/sec: 2194.36 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:36:15,061 epoch 8 - iter 1083/3617 - loss 0.00849593 - time (sec): 51.12 - samples/sec: 2217.24 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:36:30,777 epoch 8 - iter 1444/3617 - loss 0.00914655 - time (sec): 66.84 - samples/sec: 2270.51 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:36:46,752 epoch 8 - iter 1805/3617 - loss 0.00874731 - time (sec): 82.82 - samples/sec: 2299.67 - lr: 0.000008 - momentum: 0.000000
2023-10-14 20:37:03,211 epoch 8 - iter 2166/3617 - loss 0.00988221 - time (sec): 99.28 - samples/sec: 2296.48 - lr: 0.000008 - momentum: 0.000000
2023-10-14 20:37:19,499 epoch 8 - iter 2527/3617 - loss 0.01064127 - time (sec): 115.56 - samples/sec: 2299.50 - lr: 0.000008 - momentum: 0.000000
2023-10-14 20:37:35,851 epoch 8 - iter 2888/3617 - loss 0.01050074 - time (sec): 131.92 - samples/sec: 2306.37 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:37:52,088 epoch 8 - iter 3249/3617 - loss 0.01062696 - time (sec): 148.15 - samples/sec: 2305.83 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:38:08,575 epoch 8 - iter 3610/3617 - loss 0.01078664 - time (sec): 164.64 - samples/sec: 2304.14 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:38:08,880 ----------------------------------------------------------------------------------------------------
2023-10-14 20:38:08,880 EPOCH 8 done: loss 0.0108 - lr: 0.000007
2023-10-14 20:38:16,090 DEV : loss 0.41721734404563904 - f1-score (micro avg) 0.6318
2023-10-14 20:38:16,127 saving best model
2023-10-14 20:38:16,663 ----------------------------------------------------------------------------------------------------
2023-10-14 20:38:35,739 epoch 9 - iter 361/3617 - loss 0.01118969 - time (sec): 19.07 - samples/sec: 1994.33 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:38:54,004 epoch 9 - iter 722/3617 - loss 0.00841180 - time (sec): 37.34 - samples/sec: 2045.19 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:39:10,764 epoch 9 - iter 1083/3617 - loss 0.00775291 - time (sec): 54.10 - samples/sec: 2142.25 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:39:27,106 epoch 9 - iter 1444/3617 - loss 0.00707364 - time (sec): 70.44 - samples/sec: 2169.75 - lr: 0.000005 - momentum: 0.000000
2023-10-14 20:39:43,532 epoch 9 - iter 1805/3617 - loss 0.00759031 - time (sec): 86.87 - samples/sec: 2184.85 - lr: 0.000005 - momentum: 0.000000
2023-10-14 20:39:59,860 epoch 9 - iter 2166/3617 - loss 0.00773152 - time (sec): 103.19 - samples/sec: 2200.69 - lr: 0.000005 - momentum: 0.000000
2023-10-14 20:40:16,264 epoch 9 - iter 2527/3617 - loss 0.00812156 - time (sec): 119.60 - samples/sec: 2214.86 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:40:32,665 epoch 9 - iter 2888/3617 - loss 0.00788977 - time (sec): 136.00 - samples/sec: 2226.94 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:40:49,193 epoch 9 - iter 3249/3617 - loss 0.00758141 - time (sec): 152.53 - samples/sec: 2235.85 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:41:05,613 epoch 9 - iter 3610/3617 - loss 0.00754676 - time (sec): 168.95 - samples/sec: 2245.32 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:41:05,923 ----------------------------------------------------------------------------------------------------
2023-10-14 20:41:05,923 EPOCH 9 done: loss 0.0075 - lr: 0.000003
2023-10-14 20:41:11,659 DEV : loss 0.4234275221824646 - f1-score (micro avg) 0.6324
2023-10-14 20:41:11,697 saving best model
2023-10-14 20:41:12,297 ----------------------------------------------------------------------------------------------------
2023-10-14 20:41:31,527 epoch 10 - iter 361/3617 - loss 0.00680316 - time (sec): 19.23 - samples/sec: 1971.51 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:41:50,552 epoch 10 - iter 722/3617 - loss 0.00602000 - time (sec): 38.25 - samples/sec: 1971.41 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:42:08,808 epoch 10 - iter 1083/3617 - loss 0.00486451 - time (sec): 56.51 - samples/sec: 2016.08 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:42:25,580 epoch 10 - iter 1444/3617 - loss 0.00458704 - time (sec): 73.28 - samples/sec: 2060.75 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:42:41,997 epoch 10 - iter 1805/3617 - loss 0.00443742 - time (sec): 89.70 - samples/sec: 2113.84 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:42:58,053 epoch 10 - iter 2166/3617 - loss 0.00538436 - time (sec): 105.75 - samples/sec: 2138.96 - lr: 0.000001 - momentum: 0.000000
2023-10-14 20:43:14,025 epoch 10 - iter 2527/3617 - loss 0.00544432 - time (sec): 121.72 - samples/sec: 2168.45 - lr: 0.000001 - momentum: 0.000000
2023-10-14 20:43:30,436 epoch 10 - iter 2888/3617 - loss 0.00544999 - time (sec): 138.14 - samples/sec: 2196.22 - lr: 0.000001 - momentum: 0.000000
2023-10-14 20:43:46,505 epoch 10 - iter 3249/3617 - loss 0.00524360 - time (sec): 154.20 - samples/sec: 2213.00 - lr: 0.000000 - momentum: 0.000000
2023-10-14 20:44:02,713 epoch 10 - iter 3610/3617 - loss 0.00547411 - time (sec): 170.41 - samples/sec: 2225.38 - lr: 0.000000 - momentum: 0.000000
2023-10-14 20:44:03,016 ----------------------------------------------------------------------------------------------------
2023-10-14 20:44:03,016 EPOCH 10 done: loss 0.0055 - lr: 0.000000
2023-10-14 20:44:08,786 DEV : loss 0.43111804127693176 - f1-score (micro avg) 0.6361
2023-10-14 20:44:08,840 saving best model
2023-10-14 20:44:09,784 ----------------------------------------------------------------------------------------------------
2023-10-14 20:44:09,785 Loading model from best epoch ...
2023-10-14 20:44:11,351 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
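The 13-tag dictionary above follows from the BIOES tagging scheme: each of the three entity types (loc, pers, org) gets S-/B-/E-/I- variants, plus the single O tag for non-entity tokens. A quick sketch of how the count comes out:

```python
# 13 output tags = 1 "O" tag + 4 BIOES prefixes x 3 entity types
entity_types = ["loc", "pers", "org"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in ("S", "B", "E", "I")]
print(len(tags))  # 13
print(tags)       # same order as the dictionary logged above
```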
2023-10-14 20:44:20,229
Results:
- F-score (micro) 0.6471
- F-score (macro) 0.5082
- Accuracy 0.4927

By class:
              precision    recall  f1-score   support

         loc     0.6190    0.7834    0.6916       591
        pers     0.5736    0.7423    0.6471       357
         org     0.2400    0.1519    0.1860        79

   micro avg     0.5873    0.7205    0.6471      1027
   macro avg     0.4775    0.5592    0.5082      1027
weighted avg     0.5741    0.7205    0.6372      1027
2023-10-14 20:44:20,230 ----------------------------------------------------------------------------------------------------
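The summary scores are consistent with the per-class table: micro-avg F1 is the harmonic mean of micro precision and recall, while macro-avg F1 is the unweighted mean of the per-class F1 scores. A quick check using the values from the table above:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.5873, recall 0.7205
print(round(f1_score(0.5873, 0.7205), 4))        # 0.6471

# macro avg F1: mean of per-class F1 (loc, pers, org)
print(round((0.6916 + 0.6471 + 0.1860) / 3, 4))  # 0.5082
```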