2023-10-17 10:54:49,837 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,838 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:54:49,838 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,838 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:54:49,838 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Train: 966 sentences
2023-10-17 10:54:49,839 (train_with_dev=False, train_with_test=False)
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Training Params:
2023-10-17 10:54:49,839 - learning_rate: "5e-05"
2023-10-17 10:54:49,839 - mini_batch_size: "4"
2023-10-17 10:54:49,839 - max_epochs: "10"
2023-10-17 10:54:49,839 - shuffle: "True"
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Plugins:
2023-10-17 10:54:49,839 - TensorboardLogger
2023-10-17 10:54:49,839 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:54:49,839 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Computation:
2023-10-17 10:54:49,839 - compute on device: cuda:0
2023-10-17 10:54:49,839 - embedding storage: none
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
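
The parameters logged above (learning rate 5e-05, mini-batch size 4, 10 epochs, linear warmup over 10% of the steps, model selection on micro F1, training on cuda:0) correspond roughly to Flair's fine-tuning entry point. A sketch, assuming the tagger and corpus objects from the earlier snippet:

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses AdamW with a linear schedule and a default warmup fraction
# of 0.1, consistent with the LinearScheduler plugin in the log; TensorBoard
# logging is configured separately and is omitted here.
trainer.fine_tune(
    "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)
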
2023-10-17 10:54:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:49,839 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:54:50,943 epoch 1 - iter 24/242 - loss 3.92639461 - time (sec): 1.10 - samples/sec: 2185.69 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:54:52,047 epoch 1 - iter 48/242 - loss 3.04296997 - time (sec): 2.21 - samples/sec: 2152.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:54:53,168 epoch 1 - iter 72/242 - loss 2.24676922 - time (sec): 3.33 - samples/sec: 2125.43 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:54,279 epoch 1 - iter 96/242 - loss 1.75436762 - time (sec): 4.44 - samples/sec: 2199.27 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:54:55,414 epoch 1 - iter 120/242 - loss 1.50738487 - time (sec): 5.57 - samples/sec: 2161.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:54:56,565 epoch 1 - iter 144/242 - loss 1.29770685 - time (sec): 6.73 - samples/sec: 2174.19 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:54:57,742 epoch 1 - iter 168/242 - loss 1.13283400 - time (sec): 7.90 - samples/sec: 2190.62 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:54:58,906 epoch 1 - iter 192/242 - loss 1.01056784 - time (sec): 9.07 - samples/sec: 2219.15 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:55:00,077 epoch 1 - iter 216/242 - loss 0.94261585 - time (sec): 10.24 - samples/sec: 2183.51 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:55:01,189 epoch 1 - iter 240/242 - loss 0.87690795 - time (sec): 11.35 - samples/sec: 2168.80 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:55:01,279 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:01,279 EPOCH 1 done: loss 0.8745 - lr: 0.000049
2023-10-17 10:55:01,866 DEV : loss 0.17245863378047943 - f1-score (micro avg) 0.6897
2023-10-17 10:55:01,871 saving best model
2023-10-17 10:55:02,224 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:03,328 epoch 2 - iter 24/242 - loss 0.12720849 - time (sec): 1.10 - samples/sec: 2082.24 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:55:04,500 epoch 2 - iter 48/242 - loss 0.16247962 - time (sec): 2.27 - samples/sec: 2053.44 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:55:05,684 epoch 2 - iter 72/242 - loss 0.16688621 - time (sec): 3.46 - samples/sec: 2054.06 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:55:06,852 epoch 2 - iter 96/242 - loss 0.15980410 - time (sec): 4.63 - samples/sec: 2056.90 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:55:08,050 epoch 2 - iter 120/242 - loss 0.16033847 - time (sec): 5.82 - samples/sec: 2101.10 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:55:09,238 epoch 2 - iter 144/242 - loss 0.16292839 - time (sec): 7.01 - samples/sec: 2073.75 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:55:10,489 epoch 2 - iter 168/242 - loss 0.16233553 - time (sec): 8.26 - samples/sec: 2073.82 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:55:11,704 epoch 2 - iter 192/242 - loss 0.15837280 - time (sec): 9.48 - samples/sec: 2076.17 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:55:12,877 epoch 2 - iter 216/242 - loss 0.16692967 - time (sec): 10.65 - samples/sec: 2077.61 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:55:14,005 epoch 2 - iter 240/242 - loss 0.16298133 - time (sec): 11.78 - samples/sec: 2085.51 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:55:14,099 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:14,099 EPOCH 2 done: loss 0.1622 - lr: 0.000045
2023-10-17 10:55:14,847 DEV : loss 0.14140677452087402 - f1-score (micro avg) 0.817
2023-10-17 10:55:14,852 saving best model
2023-10-17 10:55:15,405 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:16,610 epoch 3 - iter 24/242 - loss 0.10393024 - time (sec): 1.20 - samples/sec: 2010.78 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:55:17,774 epoch 3 - iter 48/242 - loss 0.09928119 - time (sec): 2.36 - samples/sec: 2100.50 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:55:18,907 epoch 3 - iter 72/242 - loss 0.08611132 - time (sec): 3.50 - samples/sec: 2201.33 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:55:19,983 epoch 3 - iter 96/242 - loss 0.09746320 - time (sec): 4.57 - samples/sec: 2216.12 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:55:21,108 epoch 3 - iter 120/242 - loss 0.10170700 - time (sec): 5.70 - samples/sec: 2202.86 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:55:22,214 epoch 3 - iter 144/242 - loss 0.10049154 - time (sec): 6.80 - samples/sec: 2207.65 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:55:23,359 epoch 3 - iter 168/242 - loss 0.10292233 - time (sec): 7.95 - samples/sec: 2214.43 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:55:24,450 epoch 3 - iter 192/242 - loss 0.10253347 - time (sec): 9.04 - samples/sec: 2164.84 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:55:25,681 epoch 3 - iter 216/242 - loss 0.10587963 - time (sec): 10.27 - samples/sec: 2172.46 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:55:26,765 epoch 3 - iter 240/242 - loss 0.10863240 - time (sec): 11.36 - samples/sec: 2162.00 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:55:26,860 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:26,860 EPOCH 3 done: loss 0.1094 - lr: 0.000039
2023-10-17 10:55:27,755 DEV : loss 0.14144273102283478 - f1-score (micro avg) 0.7744
2023-10-17 10:55:27,761 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:28,875 epoch 4 - iter 24/242 - loss 0.08604125 - time (sec): 1.11 - samples/sec: 1861.28 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:55:30,027 epoch 4 - iter 48/242 - loss 0.06395054 - time (sec): 2.27 - samples/sec: 2012.14 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:55:31,147 epoch 4 - iter 72/242 - loss 0.05707292 - time (sec): 3.38 - samples/sec: 2055.84 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:55:32,303 epoch 4 - iter 96/242 - loss 0.06229815 - time (sec): 4.54 - samples/sec: 2104.22 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:55:33,429 epoch 4 - iter 120/242 - loss 0.07151279 - time (sec): 5.67 - samples/sec: 2156.06 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:55:34,510 epoch 4 - iter 144/242 - loss 0.07093158 - time (sec): 6.75 - samples/sec: 2135.41 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:55:35,646 epoch 4 - iter 168/242 - loss 0.07679301 - time (sec): 7.88 - samples/sec: 2132.70 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:55:36,781 epoch 4 - iter 192/242 - loss 0.07394390 - time (sec): 9.02 - samples/sec: 2145.66 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:55:37,969 epoch 4 - iter 216/242 - loss 0.07397380 - time (sec): 10.21 - samples/sec: 2145.94 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:55:39,066 epoch 4 - iter 240/242 - loss 0.07206965 - time (sec): 11.30 - samples/sec: 2178.72 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:55:39,154 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:39,154 EPOCH 4 done: loss 0.0717 - lr: 0.000033
2023-10-17 10:55:39,908 DEV : loss 0.17283324897289276 - f1-score (micro avg) 0.836
2023-10-17 10:55:39,913 saving best model
2023-10-17 10:55:40,419 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:41,545 epoch 5 - iter 24/242 - loss 0.05122857 - time (sec): 1.12 - samples/sec: 1932.26 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:55:42,636 epoch 5 - iter 48/242 - loss 0.05981318 - time (sec): 2.22 - samples/sec: 2005.79 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:55:43,777 epoch 5 - iter 72/242 - loss 0.05077187 - time (sec): 3.36 - samples/sec: 2097.46 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:55:44,923 epoch 5 - iter 96/242 - loss 0.05457487 - time (sec): 4.50 - samples/sec: 2081.38 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:55:46,017 epoch 5 - iter 120/242 - loss 0.04995074 - time (sec): 5.60 - samples/sec: 2103.37 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:55:47,131 epoch 5 - iter 144/242 - loss 0.05286292 - time (sec): 6.71 - samples/sec: 2172.68 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:55:48,242 epoch 5 - iter 168/242 - loss 0.05572414 - time (sec): 7.82 - samples/sec: 2154.82 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:55:49,359 epoch 5 - iter 192/242 - loss 0.05767409 - time (sec): 8.94 - samples/sec: 2164.19 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:55:50,471 epoch 5 - iter 216/242 - loss 0.05614593 - time (sec): 10.05 - samples/sec: 2176.22 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:55:51,624 epoch 5 - iter 240/242 - loss 0.05545451 - time (sec): 11.20 - samples/sec: 2200.52 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:55:51,711 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:51,711 EPOCH 5 done: loss 0.0553 - lr: 0.000028
2023-10-17 10:55:52,477 DEV : loss 0.20903627574443817 - f1-score (micro avg) 0.8327
2023-10-17 10:55:52,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:53,546 epoch 6 - iter 24/242 - loss 0.01747608 - time (sec): 1.06 - samples/sec: 2304.10 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:55:54,613 epoch 6 - iter 48/242 - loss 0.03186200 - time (sec): 2.13 - samples/sec: 2322.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:55:55,673 epoch 6 - iter 72/242 - loss 0.02827303 - time (sec): 3.19 - samples/sec: 2354.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:55:56,712 epoch 6 - iter 96/242 - loss 0.03116679 - time (sec): 4.23 - samples/sec: 2299.45 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:55:57,776 epoch 6 - iter 120/242 - loss 0.03185366 - time (sec): 5.29 - samples/sec: 2307.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:55:58,823 epoch 6 - iter 144/242 - loss 0.03273987 - time (sec): 6.34 - samples/sec: 2250.37 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:55:59,922 epoch 6 - iter 168/242 - loss 0.03376056 - time (sec): 7.44 - samples/sec: 2293.07 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:56:01,000 epoch 6 - iter 192/242 - loss 0.03120273 - time (sec): 8.52 - samples/sec: 2294.16 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:56:02,062 epoch 6 - iter 216/242 - loss 0.03176433 - time (sec): 9.58 - samples/sec: 2289.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:56:03,138 epoch 6 - iter 240/242 - loss 0.03443780 - time (sec): 10.65 - samples/sec: 2305.51 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:56:03,222 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:03,222 EPOCH 6 done: loss 0.0342 - lr: 0.000022
2023-10-17 10:56:03,970 DEV : loss 0.20771022140979767 - f1-score (micro avg) 0.8579
2023-10-17 10:56:03,976 saving best model
2023-10-17 10:56:04,527 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:05,657 epoch 7 - iter 24/242 - loss 0.01389319 - time (sec): 1.13 - samples/sec: 2125.86 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:56:06,806 epoch 7 - iter 48/242 - loss 0.03079333 - time (sec): 2.27 - samples/sec: 2048.45 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:56:07,974 epoch 7 - iter 72/242 - loss 0.02531122 - time (sec): 3.44 - samples/sec: 2100.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:56:09,106 epoch 7 - iter 96/242 - loss 0.02382481 - time (sec): 4.57 - samples/sec: 2112.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:56:10,217 epoch 7 - iter 120/242 - loss 0.02394786 - time (sec): 5.69 - samples/sec: 2156.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:56:11,370 epoch 7 - iter 144/242 - loss 0.02896998 - time (sec): 6.84 - samples/sec: 2131.58 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:56:12,497 epoch 7 - iter 168/242 - loss 0.02898444 - time (sec): 7.97 - samples/sec: 2124.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:56:13,633 epoch 7 - iter 192/242 - loss 0.02772235 - time (sec): 9.10 - samples/sec: 2136.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:56:14,779 epoch 7 - iter 216/242 - loss 0.03031956 - time (sec): 10.25 - samples/sec: 2151.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:56:15,892 epoch 7 - iter 240/242 - loss 0.02928220 - time (sec): 11.36 - samples/sec: 2168.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:56:15,984 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:15,984 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-17 10:56:16,738 DEV : loss 0.2137136161327362 - f1-score (micro avg) 0.851
2023-10-17 10:56:16,743 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:17,874 epoch 8 - iter 24/242 - loss 0.03087610 - time (sec): 1.13 - samples/sec: 1895.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:56:19,021 epoch 8 - iter 48/242 - loss 0.02349420 - time (sec): 2.28 - samples/sec: 2107.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:56:20,219 epoch 8 - iter 72/242 - loss 0.01962876 - time (sec): 3.47 - samples/sec: 2060.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:56:21,391 epoch 8 - iter 96/242 - loss 0.01966456 - time (sec): 4.65 - samples/sec: 2156.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:56:22,513 epoch 8 - iter 120/242 - loss 0.01867760 - time (sec): 5.77 - samples/sec: 2171.79 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:56:23,627 epoch 8 - iter 144/242 - loss 0.01698197 - time (sec): 6.88 - samples/sec: 2177.10 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:56:24,717 epoch 8 - iter 168/242 - loss 0.01799341 - time (sec): 7.97 - samples/sec: 2153.06 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:56:25,815 epoch 8 - iter 192/242 - loss 0.01797490 - time (sec): 9.07 - samples/sec: 2164.11 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:56:26,937 epoch 8 - iter 216/242 - loss 0.01687743 - time (sec): 10.19 - samples/sec: 2183.19 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:56:28,061 epoch 8 - iter 240/242 - loss 0.01884973 - time (sec): 11.32 - samples/sec: 2174.54 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:56:28,151 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:28,151 EPOCH 8 done: loss 0.0188 - lr: 0.000011
2023-10-17 10:56:28,897 DEV : loss 0.23500145971775055 - f1-score (micro avg) 0.8308
2023-10-17 10:56:28,902 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:30,021 epoch 9 - iter 24/242 - loss 0.01912375 - time (sec): 1.12 - samples/sec: 2236.59 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:56:31,185 epoch 9 - iter 48/242 - loss 0.01160460 - time (sec): 2.28 - samples/sec: 2103.10 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:56:32,436 epoch 9 - iter 72/242 - loss 0.01393317 - time (sec): 3.53 - samples/sec: 1984.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:56:33,660 epoch 9 - iter 96/242 - loss 0.01317219 - time (sec): 4.76 - samples/sec: 1958.55 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:56:34,862 epoch 9 - iter 120/242 - loss 0.01130146 - time (sec): 5.96 - samples/sec: 1941.54 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:36,027 epoch 9 - iter 144/242 - loss 0.01143745 - time (sec): 7.12 - samples/sec: 1998.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:56:37,136 epoch 9 - iter 168/242 - loss 0.01338574 - time (sec): 8.23 - samples/sec: 2043.43 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:56:38,306 epoch 9 - iter 192/242 - loss 0.01238712 - time (sec): 9.40 - samples/sec: 2074.54 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:56:39,479 epoch 9 - iter 216/242 - loss 0.01393675 - time (sec): 10.58 - samples/sec: 2099.41 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:56:40,612 epoch 9 - iter 240/242 - loss 0.01450758 - time (sec): 11.71 - samples/sec: 2100.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:56:40,709 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:40,709 EPOCH 9 done: loss 0.0144 - lr: 0.000006
2023-10-17 10:56:41,455 DEV : loss 0.2329409271478653 - f1-score (micro avg) 0.8317
2023-10-17 10:56:41,460 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:42,621 epoch 10 - iter 24/242 - loss 0.00912702 - time (sec): 1.16 - samples/sec: 2040.82 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:56:43,716 epoch 10 - iter 48/242 - loss 0.01040602 - time (sec): 2.25 - samples/sec: 2024.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:56:44,872 epoch 10 - iter 72/242 - loss 0.01081807 - time (sec): 3.41 - samples/sec: 2099.24 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:56:45,995 epoch 10 - iter 96/242 - loss 0.01109546 - time (sec): 4.53 - samples/sec: 2137.61 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:56:47,133 epoch 10 - iter 120/242 - loss 0.01296140 - time (sec): 5.67 - samples/sec: 2241.61 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:56:48,242 epoch 10 - iter 144/242 - loss 0.01117121 - time (sec): 6.78 - samples/sec: 2192.51 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:56:49,346 epoch 10 - iter 168/242 - loss 0.00974678 - time (sec): 7.88 - samples/sec: 2164.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:56:50,489 epoch 10 - iter 192/242 - loss 0.00970789 - time (sec): 9.03 - samples/sec: 2182.58 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:56:51,625 epoch 10 - iter 216/242 - loss 0.00940542 - time (sec): 10.16 - samples/sec: 2206.48 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:56:52,740 epoch 10 - iter 240/242 - loss 0.00865941 - time (sec): 11.28 - samples/sec: 2178.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:56:52,823 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:52,824 EPOCH 10 done: loss 0.0086 - lr: 0.000000
2023-10-17 10:56:53,569 DEV : loss 0.24286319315433502 - f1-score (micro avg) 0.8363
2023-10-17 10:56:53,947 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:53,949 Loading model from best epoch ...
2023-10-17 10:56:55,284 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:56:56,147
Results:
- F-score (micro) 0.8379
- F-score (macro) 0.575
- Accuracy 0.7332
By class:
              precision    recall  f1-score   support

        pers      0.8750    0.9065    0.8905       139
       scope      0.8321    0.8837    0.8571       129
        work      0.7412    0.7875    0.7636        80
         loc      1.0000    0.2222    0.3636         9
        date      0.0000    0.0000    0.0000         3

   micro avg      0.8288    0.8472    0.8379       360
   macro avg      0.6897    0.5600    0.5750       360
weighted avg      0.8257    0.8472    0.8297       360
2023-10-17 10:56:56,147 ----------------------------------------------------------------------------------------------------
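
To reuse the checkpoint behind the scores above, the best-model.pt saved under the training base path can be loaded back into Flair; the example sentence below is purely illustrative.

from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt is written under the training base path shown in the log.
tagger = SequenceTagger.load(
    "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Homère est l'auteur de l'Iliade.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)
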