flair-hipe-2022-ajmc-fr / training.log
2023-10-17 10:52:13,216 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,217 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:52:13,217 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,217 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 Train: 966 sentences
2023-10-17 10:52:13,218 (train_with_dev=False, train_with_test=False)
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 Training Params:
2023-10-17 10:52:13,218 - learning_rate: "3e-05"
2023-10-17 10:52:13,218 - mini_batch_size: "4"
2023-10-17 10:52:13,218 - max_epochs: "10"
2023-10-17 10:52:13,218 - shuffle: "True"
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 Plugins:
2023-10-17 10:52:13,218 - TensorboardLogger
2023-10-17 10:52:13,218 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
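The `lr` values logged during training are consistent with a linear warmup followed by a linear decay to zero. The sketch below is a minimal reconstruction under stated assumptions, not the scheduler's actual code: the function name `linear_warmup_lr` is hypothetical, the total of 2420 steps is taken as 242 iterations per epoch times 10 epochs, and the warmup length is assumed to be `warmup_fraction` of that total.

```python
def linear_warmup_lr(step, peak_lr=3e-05, total_steps=2420, warmup_fraction=0.1):
    """Learning rate after `step` optimizer updates.

    Assumed shape: linear ramp from 0 to peak_lr over the warmup steps,
    then linear decay from peak_lr back to 0 at total_steps.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 242 for this run
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Compare against a few logged values (the log prints six decimal places).
for step in (24, 240, 242 + 48, 2420):
    print(step, f"{linear_warmup_lr(step):.6f}")
```

With these assumptions, step 24 corresponds to epoch 1, iter 24, and step 290 to epoch 2, iter 48; both round to the values shown in the iteration lines of the log.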
2023-10-17 10:52:13,218 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:52:13,218 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 Computation:
2023-10-17 10:52:13,218 - compute on device: cuda:0
2023-10-17 10:52:13,218 - embedding storage: none
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:13,218 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:52:14,319 epoch 1 - iter 24/242 - loss 4.05561927 - time (sec): 1.10 - samples/sec: 2190.88 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:52:15,412 epoch 1 - iter 48/242 - loss 3.46627133 - time (sec): 2.19 - samples/sec: 2166.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:52:16,512 epoch 1 - iter 72/242 - loss 2.62777849 - time (sec): 3.29 - samples/sec: 2147.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:52:17,621 epoch 1 - iter 96/242 - loss 2.06961410 - time (sec): 4.40 - samples/sec: 2218.01 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:52:18,710 epoch 1 - iter 120/242 - loss 1.78636708 - time (sec): 5.49 - samples/sec: 2194.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:52:19,830 epoch 1 - iter 144/242 - loss 1.54146613 - time (sec): 6.61 - samples/sec: 2211.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:52:20,933 epoch 1 - iter 168/242 - loss 1.34360051 - time (sec): 7.71 - samples/sec: 2243.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:52:22,058 epoch 1 - iter 192/242 - loss 1.19671631 - time (sec): 8.84 - samples/sec: 2276.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:52:23,141 epoch 1 - iter 216/242 - loss 1.11363742 - time (sec): 9.92 - samples/sec: 2253.01 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:52:24,233 epoch 1 - iter 240/242 - loss 1.03601016 - time (sec): 11.01 - samples/sec: 2234.83 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:52:24,322 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:24,323 EPOCH 1 done: loss 1.0336 - lr: 0.000030
2023-10-17 10:52:24,897 DEV : loss 0.2042546272277832 - f1-score (micro avg) 0.6459
2023-10-17 10:52:24,902 saving best model
2023-10-17 10:52:25,314 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:26,406 epoch 2 - iter 24/242 - loss 0.18992354 - time (sec): 1.09 - samples/sec: 2103.59 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:52:27,515 epoch 2 - iter 48/242 - loss 0.19235864 - time (sec): 2.20 - samples/sec: 2123.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:52:28,633 epoch 2 - iter 72/242 - loss 0.19211539 - time (sec): 3.32 - samples/sec: 2141.29 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:52:29,736 epoch 2 - iter 96/242 - loss 0.18398095 - time (sec): 4.42 - samples/sec: 2152.87 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:52:30,835 epoch 2 - iter 120/242 - loss 0.17679653 - time (sec): 5.52 - samples/sec: 2217.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:52:31,925 epoch 2 - iter 144/242 - loss 0.17400339 - time (sec): 6.61 - samples/sec: 2200.06 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:52:33,021 epoch 2 - iter 168/242 - loss 0.17306064 - time (sec): 7.71 - samples/sec: 2224.11 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:52:34,142 epoch 2 - iter 192/242 - loss 0.16742092 - time (sec): 8.83 - samples/sec: 2229.51 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:52:35,248 epoch 2 - iter 216/242 - loss 0.17430464 - time (sec): 9.93 - samples/sec: 2228.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:52:36,349 epoch 2 - iter 240/242 - loss 0.17096481 - time (sec): 11.03 - samples/sec: 2226.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:52:36,438 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:36,439 EPOCH 2 done: loss 0.1698 - lr: 0.000027
2023-10-17 10:52:37,325 DEV : loss 0.14782120287418365 - f1-score (micro avg) 0.8029
2023-10-17 10:52:37,330 saving best model
2023-10-17 10:52:37,889 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:39,021 epoch 3 - iter 24/242 - loss 0.12982655 - time (sec): 1.13 - samples/sec: 2135.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:52:40,123 epoch 3 - iter 48/242 - loss 0.10924199 - time (sec): 2.23 - samples/sec: 2224.54 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:52:41,234 epoch 3 - iter 72/242 - loss 0.09209921 - time (sec): 3.34 - samples/sec: 2302.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:52:42,342 epoch 3 - iter 96/242 - loss 0.09709447 - time (sec): 4.45 - samples/sec: 2276.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:52:43,444 epoch 3 - iter 120/242 - loss 0.10103426 - time (sec): 5.55 - samples/sec: 2260.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:52:44,535 epoch 3 - iter 144/242 - loss 0.10044124 - time (sec): 6.64 - samples/sec: 2260.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:52:45,641 epoch 3 - iter 168/242 - loss 0.10471543 - time (sec): 7.75 - samples/sec: 2271.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:52:46,742 epoch 3 - iter 192/242 - loss 0.10262288 - time (sec): 8.85 - samples/sec: 2211.03 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:52:47,881 epoch 3 - iter 216/242 - loss 0.10250671 - time (sec): 9.99 - samples/sec: 2233.48 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:52:48,964 epoch 3 - iter 240/242 - loss 0.10517789 - time (sec): 11.07 - samples/sec: 2216.96 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:52:49,053 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:49,053 EPOCH 3 done: loss 0.1058 - lr: 0.000023
2023-10-17 10:52:49,833 DEV : loss 0.16301028430461884 - f1-score (micro avg) 0.8143
2023-10-17 10:52:49,839 saving best model
2023-10-17 10:52:50,402 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:51,523 epoch 4 - iter 24/242 - loss 0.08820816 - time (sec): 1.12 - samples/sec: 1856.38 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:52:52,629 epoch 4 - iter 48/242 - loss 0.08121919 - time (sec): 2.22 - samples/sec: 2051.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:52:53,738 epoch 4 - iter 72/242 - loss 0.07061606 - time (sec): 3.33 - samples/sec: 2089.11 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:52:54,866 epoch 4 - iter 96/242 - loss 0.07100203 - time (sec): 4.46 - samples/sec: 2143.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:52:55,982 epoch 4 - iter 120/242 - loss 0.06984215 - time (sec): 5.58 - samples/sec: 2191.72 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:52:57,068 epoch 4 - iter 144/242 - loss 0.07062933 - time (sec): 6.66 - samples/sec: 2162.97 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:52:58,185 epoch 4 - iter 168/242 - loss 0.07553476 - time (sec): 7.78 - samples/sec: 2161.88 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:52:59,303 epoch 4 - iter 192/242 - loss 0.07153261 - time (sec): 8.90 - samples/sec: 2175.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:53:00,435 epoch 4 - iter 216/242 - loss 0.07264171 - time (sec): 10.03 - samples/sec: 2184.21 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:53:01,529 epoch 4 - iter 240/242 - loss 0.07099997 - time (sec): 11.12 - samples/sec: 2214.28 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:53:01,620 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:01,620 EPOCH 4 done: loss 0.0707 - lr: 0.000020
2023-10-17 10:53:02,370 DEV : loss 0.16144302487373352 - f1-score (micro avg) 0.8344
2023-10-17 10:53:02,375 saving best model
2023-10-17 10:53:02,879 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:04,051 epoch 5 - iter 24/242 - loss 0.03403467 - time (sec): 1.17 - samples/sec: 1861.56 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:53:05,152 epoch 5 - iter 48/242 - loss 0.04342883 - time (sec): 2.27 - samples/sec: 1958.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:53:06,263 epoch 5 - iter 72/242 - loss 0.04288749 - time (sec): 3.38 - samples/sec: 2083.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:53:07,380 epoch 5 - iter 96/242 - loss 0.04581926 - time (sec): 4.50 - samples/sec: 2084.16 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:53:08,465 epoch 5 - iter 120/242 - loss 0.04481656 - time (sec): 5.58 - samples/sec: 2109.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:53:09,619 epoch 5 - iter 144/242 - loss 0.04898354 - time (sec): 6.73 - samples/sec: 2164.60 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:53:10,700 epoch 5 - iter 168/242 - loss 0.05051468 - time (sec): 7.82 - samples/sec: 2156.24 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:53:11,843 epoch 5 - iter 192/242 - loss 0.05227590 - time (sec): 8.96 - samples/sec: 2158.97 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:12,960 epoch 5 - iter 216/242 - loss 0.05101012 - time (sec): 10.08 - samples/sec: 2170.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:14,094 epoch 5 - iter 240/242 - loss 0.05102788 - time (sec): 11.21 - samples/sec: 2199.28 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:53:14,185 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:14,185 EPOCH 5 done: loss 0.0509 - lr: 0.000017
2023-10-17 10:53:14,978 DEV : loss 0.19747966527938843 - f1-score (micro avg) 0.8398
2023-10-17 10:53:14,985 saving best model
2023-10-17 10:53:15,559 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:16,884 epoch 6 - iter 24/242 - loss 0.02212071 - time (sec): 1.32 - samples/sec: 1860.89 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:18,184 epoch 6 - iter 48/242 - loss 0.03533340 - time (sec): 2.62 - samples/sec: 1891.13 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:19,423 epoch 6 - iter 72/242 - loss 0.03639866 - time (sec): 3.86 - samples/sec: 1948.61 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:20,713 epoch 6 - iter 96/242 - loss 0.03590078 - time (sec): 5.14 - samples/sec: 1890.09 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:53:22,004 epoch 6 - iter 120/242 - loss 0.03634387 - time (sec): 6.44 - samples/sec: 1897.36 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:53:23,321 epoch 6 - iter 144/242 - loss 0.03756964 - time (sec): 7.75 - samples/sec: 1840.30 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:53:24,657 epoch 6 - iter 168/242 - loss 0.03954105 - time (sec): 9.09 - samples/sec: 1876.93 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:53:25,838 epoch 6 - iter 192/242 - loss 0.03720847 - time (sec): 10.27 - samples/sec: 1902.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:53:26,967 epoch 6 - iter 216/242 - loss 0.03699606 - time (sec): 11.40 - samples/sec: 1923.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:53:28,108 epoch 6 - iter 240/242 - loss 0.03933757 - time (sec): 12.54 - samples/sec: 1958.90 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:53:28,208 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:28,209 EPOCH 6 done: loss 0.0391 - lr: 0.000013
2023-10-17 10:53:28,961 DEV : loss 0.20731210708618164 - f1-score (micro avg) 0.8615
2023-10-17 10:53:28,966 saving best model
2023-10-17 10:53:29,514 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:30,740 epoch 7 - iter 24/242 - loss 0.01339709 - time (sec): 1.22 - samples/sec: 1959.31 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:53:31,952 epoch 7 - iter 48/242 - loss 0.02690876 - time (sec): 2.43 - samples/sec: 1914.80 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:53:33,064 epoch 7 - iter 72/242 - loss 0.02770343 - time (sec): 3.55 - samples/sec: 2039.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:53:34,207 epoch 7 - iter 96/242 - loss 0.02733078 - time (sec): 4.69 - samples/sec: 2061.37 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:53:35,326 epoch 7 - iter 120/242 - loss 0.02754959 - time (sec): 5.81 - samples/sec: 2111.21 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:53:36,482 epoch 7 - iter 144/242 - loss 0.02877701 - time (sec): 6.96 - samples/sec: 2093.36 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:53:37,671 epoch 7 - iter 168/242 - loss 0.02886521 - time (sec): 8.15 - samples/sec: 2075.61 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:53:38,783 epoch 7 - iter 192/242 - loss 0.02748314 - time (sec): 9.26 - samples/sec: 2098.79 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:53:39,905 epoch 7 - iter 216/242 - loss 0.02928343 - time (sec): 10.39 - samples/sec: 2122.71 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:53:41,028 epoch 7 - iter 240/242 - loss 0.03044296 - time (sec): 11.51 - samples/sec: 2140.96 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:53:41,130 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:41,131 EPOCH 7 done: loss 0.0303 - lr: 0.000010
2023-10-17 10:53:41,882 DEV : loss 0.23359926044940948 - f1-score (micro avg) 0.827
2023-10-17 10:53:41,887 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:43,032 epoch 8 - iter 24/242 - loss 0.03195256 - time (sec): 1.14 - samples/sec: 1874.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:53:44,194 epoch 8 - iter 48/242 - loss 0.02723979 - time (sec): 2.31 - samples/sec: 2080.64 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:53:45,318 epoch 8 - iter 72/242 - loss 0.02294677 - time (sec): 3.43 - samples/sec: 2087.71 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:53:46,505 epoch 8 - iter 96/242 - loss 0.02098985 - time (sec): 4.62 - samples/sec: 2170.01 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:53:47,611 epoch 8 - iter 120/242 - loss 0.02039880 - time (sec): 5.72 - samples/sec: 2189.68 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:53:48,689 epoch 8 - iter 144/242 - loss 0.01846555 - time (sec): 6.80 - samples/sec: 2203.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:53:49,802 epoch 8 - iter 168/242 - loss 0.02038616 - time (sec): 7.91 - samples/sec: 2169.22 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:53:50,907 epoch 8 - iter 192/242 - loss 0.02095739 - time (sec): 9.02 - samples/sec: 2176.58 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:53:52,020 epoch 8 - iter 216/242 - loss 0.01907598 - time (sec): 10.13 - samples/sec: 2196.44 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:53:53,141 epoch 8 - iter 240/242 - loss 0.02064253 - time (sec): 11.25 - samples/sec: 2187.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:53:53,226 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:53,227 EPOCH 8 done: loss 0.0206 - lr: 0.000007
2023-10-17 10:53:53,975 DEV : loss 0.25143641233444214 - f1-score (micro avg) 0.825
2023-10-17 10:53:53,980 ----------------------------------------------------------------------------------------------------
2023-10-17 10:53:55,135 epoch 9 - iter 24/242 - loss 0.02183672 - time (sec): 1.15 - samples/sec: 2166.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:53:56,242 epoch 9 - iter 48/242 - loss 0.01557288 - time (sec): 2.26 - samples/sec: 2123.02 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:53:57,385 epoch 9 - iter 72/242 - loss 0.01416315 - time (sec): 3.40 - samples/sec: 2059.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:53:58,528 epoch 9 - iter 96/242 - loss 0.01572072 - time (sec): 4.55 - samples/sec: 2049.01 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:53:59,640 epoch 9 - iter 120/242 - loss 0.01466146 - time (sec): 5.66 - samples/sec: 2044.53 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:54:00,805 epoch 9 - iter 144/242 - loss 0.01491075 - time (sec): 6.82 - samples/sec: 2086.41 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:54:01,942 epoch 9 - iter 168/242 - loss 0.01670932 - time (sec): 7.96 - samples/sec: 2113.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:54:03,059 epoch 9 - iter 192/242 - loss 0.01561535 - time (sec): 9.08 - samples/sec: 2148.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:54:04,252 epoch 9 - iter 216/242 - loss 0.01710124 - time (sec): 10.27 - samples/sec: 2161.63 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:54:05,365 epoch 9 - iter 240/242 - loss 0.01794107 - time (sec): 11.38 - samples/sec: 2160.84 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:54:05,449 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:05,450 EPOCH 9 done: loss 0.0178 - lr: 0.000003
2023-10-17 10:54:06,201 DEV : loss 0.25199437141418457 - f1-score (micro avg) 0.8229
2023-10-17 10:54:06,206 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:07,293 epoch 10 - iter 24/242 - loss 0.01009403 - time (sec): 1.08 - samples/sec: 2180.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:54:08,396 epoch 10 - iter 48/242 - loss 0.01346526 - time (sec): 2.19 - samples/sec: 2086.31 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:54:09,500 epoch 10 - iter 72/242 - loss 0.01523550 - time (sec): 3.29 - samples/sec: 2174.91 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:54:10,618 epoch 10 - iter 96/242 - loss 0.01404734 - time (sec): 4.41 - samples/sec: 2197.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:54:11,776 epoch 10 - iter 120/242 - loss 0.01621470 - time (sec): 5.57 - samples/sec: 2283.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:54:12,888 epoch 10 - iter 144/242 - loss 0.01390069 - time (sec): 6.68 - samples/sec: 2225.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:54:14,010 epoch 10 - iter 168/242 - loss 0.01217239 - time (sec): 7.80 - samples/sec: 2187.64 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:54:15,146 epoch 10 - iter 192/242 - loss 0.01207740 - time (sec): 8.94 - samples/sec: 2204.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:54:16,254 epoch 10 - iter 216/242 - loss 0.01231007 - time (sec): 10.05 - samples/sec: 2232.19 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:54:17,351 epoch 10 - iter 240/242 - loss 0.01201054 - time (sec): 11.14 - samples/sec: 2205.37 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:54:17,437 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:17,438 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-17 10:54:18,220 DEV : loss 0.2569812536239624 - f1-score (micro avg) 0.8254
2023-10-17 10:54:18,602 ----------------------------------------------------------------------------------------------------
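The checkpoint reloaded for the final evaluation is the one written at the last "saving best model" message. A minimal sketch for recovering that epoch from the `DEV` lines follows; the helper name `dev_scores` and the regular expression are illustrative assumptions about this log layout, and the sample text simply repeats the ten `DEV` records above, assuming exactly one per epoch.

```python
import re

# Matches the message part of a DEV record as it appears in this log.
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)


def dev_scores(log_text):
    """Return [(epoch, dev_loss, dev_f1), ...], assuming one DEV line per epoch."""
    return [
        (epoch, float(m["loss"]), float(m["f1"]))
        for epoch, m in enumerate(DEV_LINE.finditer(log_text), start=1)
    ]


sample = """\
2023-10-17 10:52:24,897 DEV : loss 0.2042546272277832 - f1-score (micro avg) 0.6459
2023-10-17 10:52:37,325 DEV : loss 0.14782120287418365 - f1-score (micro avg) 0.8029
2023-10-17 10:52:49,833 DEV : loss 0.16301028430461884 - f1-score (micro avg) 0.8143
2023-10-17 10:53:02,370 DEV : loss 0.16144302487373352 - f1-score (micro avg) 0.8344
2023-10-17 10:53:14,978 DEV : loss 0.19747966527938843 - f1-score (micro avg) 0.8398
2023-10-17 10:53:28,961 DEV : loss 0.20731210708618164 - f1-score (micro avg) 0.8615
2023-10-17 10:53:41,882 DEV : loss 0.23359926044940948 - f1-score (micro avg) 0.827
2023-10-17 10:53:53,975 DEV : loss 0.25143641233444214 - f1-score (micro avg) 0.825
2023-10-17 10:54:06,201 DEV : loss 0.25199437141418457 - f1-score (micro avg) 0.8229
2023-10-17 10:54:18,220 DEV : loss 0.2569812536239624 - f1-score (micro avg) 0.8254
"""

scores = dev_scores(sample)
best = max(scores, key=lambda s: s[2])  # highest dev micro F1
print(best)
```

Selecting on dev micro F1 rather than dev loss matters here: the dev loss is lowest at epoch 2, while the dev F1 peaks at epoch 6.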
2023-10-17 10:54:18,603 Loading model from best epoch ...
2023-10-17 10:54:19,941 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
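The 25 tags match the `out_features=25` of the final linear layer: one `O` tag plus four BIOES-style prefixes for each of six entity types. The sketch below rebuilds the list in the logged order; `bioes_tags` is a hypothetical helper for illustration, not a Flair function.

```python
ENTITY_TYPES = ["scope", "pers", "work", "loc", "object", "date"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside (order as logged)


def bioes_tags(entity_types):
    """Build the tag list: 'O' followed by every prefix-type combination."""
    tags = ["O"]
    for entity in entity_types:
        tags.extend(f"{prefix}-{entity}" for prefix in PREFIXES)
    return tags


tags = bioes_tags(ENTITY_TYPES)
print(len(tags), tags)
```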
2023-10-17 10:54:20,761
Results:
- F-score (micro) 0.8253
- F-score (macro) 0.5669
- Accuracy 0.7194
By class:
              precision    recall  f1-score   support

        pers     0.8493    0.8921    0.8702       139
       scope     0.8485    0.8682    0.8582       129
        work     0.7126    0.7750    0.7425        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.8174    0.8333    0.8253       360
   macro avg     0.6821    0.5515    0.5669       360
weighted avg     0.8153    0.8333    0.8176       360
2023-10-17 10:54:20,761 ----------------------------------------------------------------------------------------------------
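The gap between the micro F-score (0.8253) and the macro F-score (0.5669) comes from the two small classes, `loc` and `date`, which count as much as `pers` in an unweighted mean. The following sketch recomputes the summary rows from the per-class rows of the report. It assumes the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall); the helper names are illustrative, and small differences in the last digit are expected because the inputs are already rounded.

```python
# (precision, recall, f1, support) copied from the "By class" report.
per_class = {
    "pers":  (0.8493, 0.8921, 0.8702, 139),
    "scope": (0.8485, 0.8682, 0.8582, 129),
    "work":  (0.7126, 0.7750, 0.7425, 80),
    "loc":   (1.0000, 0.2222, 0.3636, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}


def macro_avg(column):
    """Unweighted mean of one metric column across classes."""
    return sum(row[column] for row in per_class.values()) / len(per_class)


def weighted_avg(column):
    """Mean of one metric column weighted by class support."""
    total = sum(row[3] for row in per_class.values())
    return sum(row[column] * row[3] for row in per_class.values()) / total


micro_p, micro_r = 0.8174, 0.8333  # micro avg row of the report
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro F1    {macro_avg(2):.4f}")
print(f"weighted F1 {weighted_avg(2):.4f}")
print(f"micro F1    {micro_f1:.4f}")
```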