2023-10-17 10:37:03,892 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,893 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 10:37:03,894 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,894 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
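A hedged sketch of loading this corpus, deriving the 25-entry tag dictionary, and building the tagger with Flair's built-in NER_HIPE_2022 loader; the argument names are inferred from the dataset path above and should be treated as assumptions.

```python
# Sketch, assuming Flair's built-in NER_HIPE_2022 loader; argument names are taken
# from the dataset path above and should be treated as assumptions.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(
    dataset_name="ajmc",
    language="fr",
    add_document_separator=True,   # assumed parameter, based on ".../with_doc_seperator"
)
print(corpus)  # 966 train + 219 dev + 204 test sentences

# 25-entry tag dictionary: O plus BIOES tags for scope, pers, work, loc, object, date.
label_dict = corpus.make_label_dictionary(label_type="ner")  # "ner" label type assumed

tagger = SequenceTagger(
    hidden_size=256,   # unused without an RNN; required by the constructor
    embeddings=TransformerWordEmbeddings(
        "hmteams/teams-base-historic-multilingual-discriminator", fine_tune=True
    ),
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,     # "crfFalse" in the run name
    use_rnn=False,     # matches the plain Linear decoder printed above
)
```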
2023-10-17 10:37:03,894 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,894 Train: 966 sentences
2023-10-17 10:37:03,894 (train_with_dev=False, train_with_test=False)
2023-10-17 10:37:03,894 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,894 Training Params:
2023-10-17 10:37:03,894 - learning_rate: "5e-05"
2023-10-17 10:37:03,894 - mini_batch_size: "4"
2023-10-17 10:37:03,894 - max_epochs: "10"
2023-10-17 10:37:03,894 - shuffle: "True"
2023-10-17 10:37:03,894 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,894 Plugins:
2023-10-17 10:37:03,894 - TensorboardLogger
2023-10-17 10:37:03,894 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:37:03,894 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,894 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:37:03,894 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:37:03,894 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,894 Computation:
2023-10-17 10:37:03,894 - compute on device: cuda:0
2023-10-17 10:37:03,894 - embedding storage: none
2023-10-17 10:37:03,895 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,895 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
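Putting the configuration above together, a hedged sketch of the corresponding Flair fine-tuning call: the hyperparameter values are copied from the Training Params block, the LinearScheduler with warmup_fraction 0.1 listed under Plugins appears to be the schedule fine_tune attaches by default, and everything else is an assumption rather than the actual script.

```python
# Sketch, assuming the standard Flair trainer; hyperparameters are copied from the
# "Training Params" block above, everything else is an assumption.
import torch
import flair
from flair.trainers import ModelTrainer

flair.device = torch.device("cuda:0")     # "compute on device: cuda:0"

trainer = ModelTrainer(tagger, corpus)    # tagger and corpus as sketched above

trainer.fine_tune(
    "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,                  # shows up above as the LinearScheduler plugin
    embeddings_storage_mode="none",       # "embedding storage: none"
    main_evaluation_metric=("micro avg", "f1-score"),
)
```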
2023-10-17 10:37:03,895 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,895 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:03,895 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:37:05,003 epoch 1 - iter 24/242 - loss 4.26705654 - time (sec): 1.11 - samples/sec: 2025.41 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:37:06,141 epoch 1 - iter 48/242 - loss 3.16252160 - time (sec): 2.24 - samples/sec: 2080.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:37:07,354 epoch 1 - iter 72/242 - loss 2.25262588 - time (sec): 3.46 - samples/sec: 2122.50 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:37:08,523 epoch 1 - iter 96/242 - loss 1.85353559 - time (sec): 4.63 - samples/sec: 2089.09 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:37:09,707 epoch 1 - iter 120/242 - loss 1.57271846 - time (sec): 5.81 - samples/sec: 2110.01 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:37:10,835 epoch 1 - iter 144/242 - loss 1.36870228 - time (sec): 6.94 - samples/sec: 2106.44 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:37:11,903 epoch 1 - iter 168/242 - loss 1.19860174 - time (sec): 8.01 - samples/sec: 2160.94 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:37:12,972 epoch 1 - iter 192/242 - loss 1.10814173 - time (sec): 9.08 - samples/sec: 2145.29 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:37:14,094 epoch 1 - iter 216/242 - loss 1.00786775 - time (sec): 10.20 - samples/sec: 2169.34 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:37:15,202 epoch 1 - iter 240/242 - loss 0.92435467 - time (sec): 11.31 - samples/sec: 2177.19 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:37:15,302 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:15,302 EPOCH 1 done: loss 0.9204 - lr: 0.000049
2023-10-17 10:37:15,925 DEV : loss 0.15383611619472504 - f1-score (micro avg) 0.7509
2023-10-17 10:37:15,930 saving best model
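The lr column in the iteration lines above traces the LinearScheduler: with 242 iterations per epoch over 10 epochs (2,420 steps) and warmup_fraction 0.1, the learning rate ramps linearly from 0 to the 5e-05 peak across roughly the first epoch and then decays linearly to 0 by the final step, which matches the logged values (0.000049 at the end of epoch 1, 0.000000 at the end of epoch 10). A minimal sketch of that schedule:

```python
# Sketch of the linear warmup/decay schedule implied by the lr column
# (peak_lr=5e-05, warmup_fraction=0.1, total_steps = 10 epochs x 242 iterations).
def linear_schedule_lr(step, total_steps=2420, warmup_fraction=0.1, peak_lr=5e-05):
    warmup_steps = int(total_steps * warmup_fraction)  # 242 steps, i.e. the first epoch
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # linear warmup from 0
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay to 0

print(linear_schedule_lr(240))    # ~4.96e-05, the 0.000049 logged at iter 240/242
print(linear_schedule_lr(2420))   # 0.0, the 0.000000 logged at the end of epoch 10
```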
2023-10-17 10:37:16,417 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:17,563 epoch 2 - iter 24/242 - loss 0.22040646 - time (sec): 1.14 - samples/sec: 2030.45 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:37:18,712 epoch 2 - iter 48/242 - loss 0.22115689 - time (sec): 2.29 - samples/sec: 1951.21 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:37:19,781 epoch 2 - iter 72/242 - loss 0.21889785 - time (sec): 3.36 - samples/sec: 2071.37 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:37:20,813 epoch 2 - iter 96/242 - loss 0.19591235 - time (sec): 4.39 - samples/sec: 2069.58 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:37:21,879 epoch 2 - iter 120/242 - loss 0.19387319 - time (sec): 5.46 - samples/sec: 2115.04 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:37:22,974 epoch 2 - iter 144/242 - loss 0.18492007 - time (sec): 6.56 - samples/sec: 2160.77 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:37:24,027 epoch 2 - iter 168/242 - loss 0.18355651 - time (sec): 7.61 - samples/sec: 2191.45 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:37:25,087 epoch 2 - iter 192/242 - loss 0.17936022 - time (sec): 8.67 - samples/sec: 2224.48 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:37:26,178 epoch 2 - iter 216/242 - loss 0.17205080 - time (sec): 9.76 - samples/sec: 2233.69 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:37:27,277 epoch 2 - iter 240/242 - loss 0.16709350 - time (sec): 10.86 - samples/sec: 2254.65 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:37:27,382 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:27,382 EPOCH 2 done: loss 0.1659 - lr: 0.000045
2023-10-17 10:37:28,340 DEV : loss 0.16854800283908844 - f1-score (micro avg) 0.8193
2023-10-17 10:37:28,346 saving best model
2023-10-17 10:37:28,899 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:30,078 epoch 3 - iter 24/242 - loss 0.08547562 - time (sec): 1.18 - samples/sec: 2200.96 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:37:31,219 epoch 3 - iter 48/242 - loss 0.09930094 - time (sec): 2.32 - samples/sec: 2256.21 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:37:32,346 epoch 3 - iter 72/242 - loss 0.09997574 - time (sec): 3.44 - samples/sec: 2271.77 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:37:33,441 epoch 3 - iter 96/242 - loss 0.09649896 - time (sec): 4.54 - samples/sec: 2195.98 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:37:34,531 epoch 3 - iter 120/242 - loss 0.09690468 - time (sec): 5.63 - samples/sec: 2195.49 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:37:35,660 epoch 3 - iter 144/242 - loss 0.10360914 - time (sec): 6.76 - samples/sec: 2190.53 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:37:36,760 epoch 3 - iter 168/242 - loss 0.10467477 - time (sec): 7.86 - samples/sec: 2211.36 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:37:37,869 epoch 3 - iter 192/242 - loss 0.10251829 - time (sec): 8.97 - samples/sec: 2173.48 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:37:38,926 epoch 3 - iter 216/242 - loss 0.10081840 - time (sec): 10.02 - samples/sec: 2191.27 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:37:40,016 epoch 3 - iter 240/242 - loss 0.09893098 - time (sec): 11.11 - samples/sec: 2213.87 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:37:40,101 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:40,102 EPOCH 3 done: loss 0.0984 - lr: 0.000039
2023-10-17 10:37:40,878 DEV : loss 0.16904763877391815 - f1-score (micro avg) 0.8005
2023-10-17 10:37:40,883 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:41,991 epoch 4 - iter 24/242 - loss 0.06905514 - time (sec): 1.11 - samples/sec: 2155.26 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:37:43,099 epoch 4 - iter 48/242 - loss 0.07863967 - time (sec): 2.21 - samples/sec: 2147.25 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:37:44,197 epoch 4 - iter 72/242 - loss 0.07484439 - time (sec): 3.31 - samples/sec: 2064.99 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:37:45,293 epoch 4 - iter 96/242 - loss 0.07701269 - time (sec): 4.41 - samples/sec: 2066.82 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:37:46,405 epoch 4 - iter 120/242 - loss 0.07718673 - time (sec): 5.52 - samples/sec: 2118.94 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:37:47,558 epoch 4 - iter 144/242 - loss 0.07126967 - time (sec): 6.67 - samples/sec: 2189.42 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:37:48,733 epoch 4 - iter 168/242 - loss 0.07234945 - time (sec): 7.85 - samples/sec: 2181.02 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:37:49,932 epoch 4 - iter 192/242 - loss 0.07556880 - time (sec): 9.05 - samples/sec: 2191.41 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:37:51,087 epoch 4 - iter 216/242 - loss 0.07433665 - time (sec): 10.20 - samples/sec: 2183.17 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:37:52,225 epoch 4 - iter 240/242 - loss 0.07607920 - time (sec): 11.34 - samples/sec: 2168.57 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:37:52,316 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:52,317 EPOCH 4 done: loss 0.0764 - lr: 0.000033
2023-10-17 10:37:53,112 DEV : loss 0.18921326100826263 - f1-score (micro avg) 0.8275
2023-10-17 10:37:53,118 saving best model
2023-10-17 10:37:53,641 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:54,788 epoch 5 - iter 24/242 - loss 0.04703834 - time (sec): 1.14 - samples/sec: 2179.92 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:37:55,915 epoch 5 - iter 48/242 - loss 0.05396331 - time (sec): 2.27 - samples/sec: 2272.55 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:37:57,003 epoch 5 - iter 72/242 - loss 0.04618997 - time (sec): 3.36 - samples/sec: 2310.46 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:37:58,089 epoch 5 - iter 96/242 - loss 0.04325612 - time (sec): 4.44 - samples/sec: 2344.72 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:37:59,230 epoch 5 - iter 120/242 - loss 0.04315048 - time (sec): 5.58 - samples/sec: 2293.40 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:38:00,379 epoch 5 - iter 144/242 - loss 0.04749375 - time (sec): 6.73 - samples/sec: 2282.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:38:01,521 epoch 5 - iter 168/242 - loss 0.04617460 - time (sec): 7.87 - samples/sec: 2239.02 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:38:02,645 epoch 5 - iter 192/242 - loss 0.04659794 - time (sec): 9.00 - samples/sec: 2218.50 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:38:03,752 epoch 5 - iter 216/242 - loss 0.04654054 - time (sec): 10.11 - samples/sec: 2212.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:38:04,898 epoch 5 - iter 240/242 - loss 0.04692163 - time (sec): 11.25 - samples/sec: 2191.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:38:04,987 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:04,988 EPOCH 5 done: loss 0.0468 - lr: 0.000028
2023-10-17 10:38:05,786 DEV : loss 0.22835828363895416 - f1-score (micro avg) 0.8076
2023-10-17 10:38:05,793 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:06,901 epoch 6 - iter 24/242 - loss 0.04790088 - time (sec): 1.11 - samples/sec: 2161.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:38:07,986 epoch 6 - iter 48/242 - loss 0.04316168 - time (sec): 2.19 - samples/sec: 2078.71 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:38:09,081 epoch 6 - iter 72/242 - loss 0.03759371 - time (sec): 3.29 - samples/sec: 2177.47 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:38:10,177 epoch 6 - iter 96/242 - loss 0.04423285 - time (sec): 4.38 - samples/sec: 2219.50 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:38:11,296 epoch 6 - iter 120/242 - loss 0.04092062 - time (sec): 5.50 - samples/sec: 2235.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:38:12,403 epoch 6 - iter 144/242 - loss 0.03873800 - time (sec): 6.61 - samples/sec: 2233.80 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:38:13,481 epoch 6 - iter 168/242 - loss 0.03907001 - time (sec): 7.69 - samples/sec: 2221.49 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:38:14,577 epoch 6 - iter 192/242 - loss 0.03692244 - time (sec): 8.78 - samples/sec: 2230.85 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:38:15,691 epoch 6 - iter 216/242 - loss 0.03678498 - time (sec): 9.90 - samples/sec: 2229.90 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:38:16,804 epoch 6 - iter 240/242 - loss 0.03565390 - time (sec): 11.01 - samples/sec: 2238.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:38:16,885 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:16,885 EPOCH 6 done: loss 0.0355 - lr: 0.000022
2023-10-17 10:38:17,717 DEV : loss 0.2321670949459076 - f1-score (micro avg) 0.8399
2023-10-17 10:38:17,723 saving best model
2023-10-17 10:38:18,240 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:19,366 epoch 7 - iter 24/242 - loss 0.01134798 - time (sec): 1.12 - samples/sec: 2068.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:38:20,535 epoch 7 - iter 48/242 - loss 0.00754801 - time (sec): 2.29 - samples/sec: 2181.77 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:38:21,653 epoch 7 - iter 72/242 - loss 0.01300960 - time (sec): 3.41 - samples/sec: 2102.00 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:38:22,795 epoch 7 - iter 96/242 - loss 0.01854904 - time (sec): 4.55 - samples/sec: 2125.98 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:38:23,917 epoch 7 - iter 120/242 - loss 0.01962791 - time (sec): 5.67 - samples/sec: 2167.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:38:25,020 epoch 7 - iter 144/242 - loss 0.01789323 - time (sec): 6.78 - samples/sec: 2190.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:38:26,159 epoch 7 - iter 168/242 - loss 0.01651922 - time (sec): 7.92 - samples/sec: 2192.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:38:27,244 epoch 7 - iter 192/242 - loss 0.01724750 - time (sec): 9.00 - samples/sec: 2185.00 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:38:28,365 epoch 7 - iter 216/242 - loss 0.01826242 - time (sec): 10.12 - samples/sec: 2200.14 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:38:29,481 epoch 7 - iter 240/242 - loss 0.01823347 - time (sec): 11.24 - samples/sec: 2189.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:38:29,567 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:29,567 EPOCH 7 done: loss 0.0181 - lr: 0.000017
2023-10-17 10:38:30,339 DEV : loss 0.22075609862804413 - f1-score (micro avg) 0.8407
2023-10-17 10:38:30,345 saving best model
2023-10-17 10:38:30,911 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:32,025 epoch 8 - iter 24/242 - loss 0.00594396 - time (sec): 1.11 - samples/sec: 2332.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:38:33,141 epoch 8 - iter 48/242 - loss 0.00863514 - time (sec): 2.23 - samples/sec: 2228.01 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:38:34,207 epoch 8 - iter 72/242 - loss 0.00735051 - time (sec): 3.29 - samples/sec: 2262.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:38:35,316 epoch 8 - iter 96/242 - loss 0.00590879 - time (sec): 4.40 - samples/sec: 2282.64 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:38:36,404 epoch 8 - iter 120/242 - loss 0.01160036 - time (sec): 5.49 - samples/sec: 2272.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:38:37,558 epoch 8 - iter 144/242 - loss 0.01229182 - time (sec): 6.64 - samples/sec: 2261.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:38:38,654 epoch 8 - iter 168/242 - loss 0.01536004 - time (sec): 7.74 - samples/sec: 2256.47 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:38:39,755 epoch 8 - iter 192/242 - loss 0.01414688 - time (sec): 8.84 - samples/sec: 2245.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:38:40,972 epoch 8 - iter 216/242 - loss 0.01338241 - time (sec): 10.06 - samples/sec: 2210.19 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:38:42,099 epoch 8 - iter 240/242 - loss 0.01417627 - time (sec): 11.18 - samples/sec: 2199.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:38:42,181 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:42,181 EPOCH 8 done: loss 0.0141 - lr: 0.000011
2023-10-17 10:38:42,998 DEV : loss 0.2549717128276825 - f1-score (micro avg) 0.8338
2023-10-17 10:38:43,005 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:44,164 epoch 9 - iter 24/242 - loss 0.01687544 - time (sec): 1.16 - samples/sec: 2245.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:38:45,246 epoch 9 - iter 48/242 - loss 0.01053392 - time (sec): 2.24 - samples/sec: 2210.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:38:46,346 epoch 9 - iter 72/242 - loss 0.00966376 - time (sec): 3.34 - samples/sec: 2136.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:38:47,519 epoch 9 - iter 96/242 - loss 0.01454358 - time (sec): 4.51 - samples/sec: 2128.66 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:38:48,733 epoch 9 - iter 120/242 - loss 0.01171230 - time (sec): 5.73 - samples/sec: 2112.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:38:49,926 epoch 9 - iter 144/242 - loss 0.01031532 - time (sec): 6.92 - samples/sec: 2126.99 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:38:51,116 epoch 9 - iter 168/242 - loss 0.00962467 - time (sec): 8.11 - samples/sec: 2099.93 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:38:52,307 epoch 9 - iter 192/242 - loss 0.01038682 - time (sec): 9.30 - samples/sec: 2099.75 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:38:53,449 epoch 9 - iter 216/242 - loss 0.00980807 - time (sec): 10.44 - samples/sec: 2109.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:38:54,583 epoch 9 - iter 240/242 - loss 0.00882063 - time (sec): 11.58 - samples/sec: 2125.53 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:38:54,687 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:54,687 EPOCH 9 done: loss 0.0088 - lr: 0.000006
2023-10-17 10:38:55,451 DEV : loss 0.24903777241706848 - f1-score (micro avg) 0.8489
2023-10-17 10:38:55,456 saving best model
2023-10-17 10:38:55,997 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:57,204 epoch 10 - iter 24/242 - loss 0.02172065 - time (sec): 1.20 - samples/sec: 2019.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:38:58,442 epoch 10 - iter 48/242 - loss 0.01225404 - time (sec): 2.44 - samples/sec: 1994.99 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:38:59,531 epoch 10 - iter 72/242 - loss 0.00903661 - time (sec): 3.53 - samples/sec: 2027.85 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:39:00,637 epoch 10 - iter 96/242 - loss 0.00663820 - time (sec): 4.63 - samples/sec: 2116.32 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:39:01,758 epoch 10 - iter 120/242 - loss 0.00654181 - time (sec): 5.76 - samples/sec: 2151.73 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:39:02,894 epoch 10 - iter 144/242 - loss 0.00618297 - time (sec): 6.89 - samples/sec: 2131.46 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:39:04,036 epoch 10 - iter 168/242 - loss 0.00694130 - time (sec): 8.03 - samples/sec: 2139.22 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:39:05,171 epoch 10 - iter 192/242 - loss 0.00610222 - time (sec): 9.17 - samples/sec: 2136.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:39:06,368 epoch 10 - iter 216/242 - loss 0.00558065 - time (sec): 10.37 - samples/sec: 2126.89 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:39:07,516 epoch 10 - iter 240/242 - loss 0.00568743 - time (sec): 11.51 - samples/sec: 2136.26 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:39:07,602 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:07,603 EPOCH 10 done: loss 0.0057 - lr: 0.000000
2023-10-17 10:39:08,413 DEV : loss 0.25036904215812683 - f1-score (micro avg) 0.8564
2023-10-17 10:39:08,419 saving best model
2023-10-17 10:39:09,372 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:09,374 Loading model from best epoch ...
2023-10-17 10:39:10,807 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
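A hedged sketch of loading the saved best model and tagging a made-up sentence; the checkpoint path and the "ner" label type are assumptions, and BIOES tags from the 25-tag dictionary above (e.g. S-pers) are merged back into entity spans.

```python
# Sketch: load the saved best model and tag a made-up sentence. The checkpoint path
# and the "ner" label type are assumptions.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("Homère décrit le bouclier d'Achille au chant XVIII de l'Iliade.")
tagger.predict(sentence)

# get_spans() merges BIOES tags back into spans (e.g. S-pers -> a one-token "pers" span).
for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 4))
```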
2023-10-17 10:39:11,766
Results:
- F-score (micro) 0.8338
- F-score (macro) 0.5576
- Accuracy 0.7356
By class:
              precision    recall  f1-score   support

        pers     0.8944    0.9137    0.9039       139
       scope     0.8321    0.8837    0.8571       129
        work     0.7000    0.7875    0.7412        80
         loc     0.4000    0.2222    0.2857         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.8182    0.8500    0.8338       360
   macro avg     0.5653    0.5614    0.5576       360
weighted avg     0.8091    0.8500    0.8280       360
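As a sanity check, the aggregate rows follow directly from the per-class table: macro F1 is the unweighted mean of the five class F1 scores, and micro F1 is the harmonic mean of the micro precision and recall reported above.

```python
# Quick check of the aggregate rows against the per-class table above.
f1_per_class = [0.9039, 0.8571, 0.7412, 0.2857, 0.0000]   # pers, scope, work, loc, date

macro_f1 = sum(f1_per_class) / len(f1_per_class)
print(round(macro_f1, 4))   # 0.5576, matching "F-score (macro)"

micro_p, micro_r = 0.8182, 0.8500
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_f1, 4))   # 0.8338, matching "F-score (micro)"
```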
2023-10-17 10:39:11,766 ----------------------------------------------------------------------------------------------------