2023-10-20 09:18:28,284 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Train: 6183 sentences
2023-10-20 09:18:28,285 (train_with_dev=False, train_with_test=False)
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Training Params:
2023-10-20 09:18:28,285 - learning_rate: "5e-05"
2023-10-20 09:18:28,285 - mini_batch_size: "8"
2023-10-20 09:18:28,285 - max_epochs: "10"
2023-10-20 09:18:28,285 - shuffle: "True"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Plugins:
2023-10-20 09:18:28,285 - TensorboardLogger
2023-10-20 09:18:28,285 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 09:18:28,285 - metric: "('micro avg', 'f1-score')"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Computation:
2023-10-20 09:18:28,285 - compute on device: cuda:0
2023-10-20 09:18:28,285 - embedding storage: none
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,285 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-20 09:18:28,285 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,286 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:28,286 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 09:18:30,058 epoch 1 - iter 77/773 - loss 3.29697785 - time (sec): 1.77 - samples/sec: 7168.00 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:18:31,992 epoch 1 - iter 154/773 - loss 2.98890475 - time (sec): 3.71 - samples/sec: 6576.90 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:18:33,803 epoch 1 - iter 231/773 - loss 2.52454647 - time (sec): 5.52 - samples/sec: 6624.09 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:18:35,611 epoch 1 - iter 308/773 - loss 2.02302739 - time (sec): 7.32 - samples/sec: 6736.79 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:18:37,338 epoch 1 - iter 385/773 - loss 1.67463555 - time (sec): 9.05 - samples/sec: 6790.49 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:18:39,053 epoch 1 - iter 462/773 - loss 1.45168116 - time (sec): 10.77 - samples/sec: 6786.86 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:18:40,795 epoch 1 - iter 539/773 - loss 1.27870844 - time (sec): 12.51 - samples/sec: 6856.46 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:18:42,569 epoch 1 - iter 616/773 - loss 1.14349474 - time (sec): 14.28 - samples/sec: 6915.75 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:18:44,300 epoch 1 - iter 693/773 - loss 1.04649070 - time (sec): 16.01 - samples/sec: 6911.30 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:18:46,008 epoch 1 - iter 770/773 - loss 0.96192728 - time (sec): 17.72 - samples/sec: 6982.19 - lr: 0.000050 - momentum: 0.000000
2023-10-20 09:18:46,078 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:46,078 EPOCH 1 done: loss 0.9579 - lr: 0.000050
2023-10-20 09:18:47,081 DEV : loss 0.13065889477729797 - f1-score (micro avg) 0.0
2023-10-20 09:18:47,093 ----------------------------------------------------------------------------------------------------
2023-10-20 09:18:48,821 epoch 2 - iter 77/773 - loss 0.21812324 - time (sec): 1.73 - samples/sec: 7194.59 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:18:50,518 epoch 2 - iter 154/773 - loss 0.20906091 - time (sec): 3.42 - samples/sec: 7000.84 - lr: 0.000049 - momentum: 0.000000
2023-10-20 09:18:52,209 epoch 2 - iter 231/773 - loss 0.20993493 - time (sec): 5.12 - samples/sec: 6931.96 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:18:54,042 epoch 2 - iter 308/773 - loss 0.19690084 - time (sec): 6.95 - samples/sec: 6958.80 - lr: 0.000048 - momentum: 0.000000
2023-10-20 09:18:55,800 epoch 2 - iter 385/773 - loss 0.19499084 - time (sec): 8.71 - samples/sec: 7026.30 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:18:57,611 epoch 2 - iter 462/773 - loss 0.19104238 - time (sec): 10.52 - samples/sec: 6931.75 - lr: 0.000047 - momentum: 0.000000
2023-10-20 09:18:59,427 epoch 2 - iter 539/773 - loss 0.18943072 - time (sec): 12.33 - samples/sec: 6878.89 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:19:01,231 epoch 2 - iter 616/773 - loss 0.18586711 - time (sec): 14.14 - samples/sec: 6897.61 - lr: 0.000046 - momentum: 0.000000
2023-10-20 09:19:03,004 epoch 2 - iter 693/773 - loss 0.18459343 - time (sec): 15.91 - samples/sec: 6903.21 - lr: 0.000045 - momentum: 0.000000
2023-10-20 09:19:04,772 epoch 2 - iter 770/773 - loss 0.18021271 - time (sec): 17.68 - samples/sec: 6991.26 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:19:04,847 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:04,848 EPOCH 2 done: loss 0.1796 - lr: 0.000044
2023-10-20 09:19:06,215 DEV : loss 0.08806052803993225 - f1-score (micro avg) 0.4817
2023-10-20 09:19:06,227 saving best model
2023-10-20 09:19:06,256 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:08,034 epoch 3 - iter 77/773 - loss 0.16021178 - time (sec): 1.78 - samples/sec: 6437.36 - lr: 0.000044 - momentum: 0.000000
2023-10-20 09:19:09,769 epoch 3 - iter 154/773 - loss 0.14807212 - time (sec): 3.51 - samples/sec: 6858.33 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:19:11,509 epoch 3 - iter 231/773 - loss 0.14019763 - time (sec): 5.25 - samples/sec: 6941.49 - lr: 0.000043 - momentum: 0.000000
2023-10-20 09:19:13,236 epoch 3 - iter 308/773 - loss 0.14847424 - time (sec): 6.98 - samples/sec: 7082.25 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:19:14,934 epoch 3 - iter 385/773 - loss 0.14709744 - time (sec): 8.68 - samples/sec: 7070.55 - lr: 0.000042 - momentum: 0.000000
2023-10-20 09:19:16,681 epoch 3 - iter 462/773 - loss 0.14818828 - time (sec): 10.42 - samples/sec: 7183.86 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:19:18,432 epoch 3 - iter 539/773 - loss 0.14929835 - time (sec): 12.18 - samples/sec: 7178.10 - lr: 0.000041 - momentum: 0.000000
2023-10-20 09:19:20,162 epoch 3 - iter 616/773 - loss 0.14786552 - time (sec): 13.91 - samples/sec: 7189.40 - lr: 0.000040 - momentum: 0.000000
2023-10-20 09:19:21,899 epoch 3 - iter 693/773 - loss 0.14779447 - time (sec): 15.64 - samples/sec: 7105.63 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:19:23,605 epoch 3 - iter 770/773 - loss 0.14693652 - time (sec): 17.35 - samples/sec: 7129.47 - lr: 0.000039 - momentum: 0.000000
2023-10-20 09:19:23,676 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:23,676 EPOCH 3 done: loss 0.1467 - lr: 0.000039
2023-10-20 09:19:24,745 DEV : loss 0.07945284247398376 - f1-score (micro avg) 0.585
2023-10-20 09:19:24,756 saving best model
2023-10-20 09:19:24,790 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:26,577 epoch 4 - iter 77/773 - loss 0.13996426 - time (sec): 1.79 - samples/sec: 7057.97 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:19:28,330 epoch 4 - iter 154/773 - loss 0.13144568 - time (sec): 3.54 - samples/sec: 7020.35 - lr: 0.000038 - momentum: 0.000000
2023-10-20 09:19:30,061 epoch 4 - iter 231/773 - loss 0.13656365 - time (sec): 5.27 - samples/sec: 6788.28 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:19:31,745 epoch 4 - iter 308/773 - loss 0.13712233 - time (sec): 6.95 - samples/sec: 6967.80 - lr: 0.000037 - momentum: 0.000000
2023-10-20 09:19:33,517 epoch 4 - iter 385/773 - loss 0.13771739 - time (sec): 8.73 - samples/sec: 6955.05 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:19:35,269 epoch 4 - iter 462/773 - loss 0.13408346 - time (sec): 10.48 - samples/sec: 6958.02 - lr: 0.000036 - momentum: 0.000000
2023-10-20 09:19:37,000 epoch 4 - iter 539/773 - loss 0.13073922 - time (sec): 12.21 - samples/sec: 7056.03 - lr: 0.000035 - momentum: 0.000000
2023-10-20 09:19:38,795 epoch 4 - iter 616/773 - loss 0.13003178 - time (sec): 14.00 - samples/sec: 7078.88 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:19:40,490 epoch 4 - iter 693/773 - loss 0.13006617 - time (sec): 15.70 - samples/sec: 7082.33 - lr: 0.000034 - momentum: 0.000000
2023-10-20 09:19:42,234 epoch 4 - iter 770/773 - loss 0.12994693 - time (sec): 17.44 - samples/sec: 7100.14 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:19:42,298 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:42,298 EPOCH 4 done: loss 0.1299 - lr: 0.000033
2023-10-20 09:19:43,375 DEV : loss 0.07809021323919296 - f1-score (micro avg) 0.6071
2023-10-20 09:19:43,386 saving best model
2023-10-20 09:19:43,425 ----------------------------------------------------------------------------------------------------
2023-10-20 09:19:45,188 epoch 5 - iter 77/773 - loss 0.11117031 - time (sec): 1.76 - samples/sec: 6996.81 - lr: 0.000033 - momentum: 0.000000
2023-10-20 09:19:46,900 epoch 5 - iter 154/773 - loss 0.11764714 - time (sec): 3.47 - samples/sec: 6894.04 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:19:48,635 epoch 5 - iter 231/773 - loss 0.11991202 - time (sec): 5.21 - samples/sec: 6905.69 - lr: 0.000032 - momentum: 0.000000
2023-10-20 09:19:50,432 epoch 5 - iter 308/773 - loss 0.11700014 - time (sec): 7.01 - samples/sec: 7024.72 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:19:52,197 epoch 5 - iter 385/773 - loss 0.11617792 - time (sec): 8.77 - samples/sec: 7091.97 - lr: 0.000031 - momentum: 0.000000
2023-10-20 09:19:53,901 epoch 5 - iter 462/773 - loss 0.11525526 - time (sec): 10.48 - samples/sec: 7097.11 - lr: 0.000030 - momentum: 0.000000
2023-10-20 09:19:55,632 epoch 5 - iter 539/773 - loss 0.11333040 - time (sec): 12.21 - samples/sec: 7074.53 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:19:57,353 epoch 5 - iter 616/773 - loss 0.11550300 - time (sec): 13.93 - samples/sec: 7091.84 - lr: 0.000029 - momentum: 0.000000
2023-10-20 09:19:59,064 epoch 5 - iter 693/773 - loss 0.11629394 - time (sec): 15.64 - samples/sec: 7154.16 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:20:00,821 epoch 5 - iter 770/773 - loss 0.11699897 - time (sec): 17.40 - samples/sec: 7117.38 - lr: 0.000028 - momentum: 0.000000
2023-10-20 09:20:00,890 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:00,890 EPOCH 5 done: loss 0.1169 - lr: 0.000028
2023-10-20 09:20:01,993 DEV : loss 0.07527422904968262 - f1-score (micro avg) 0.5973
2023-10-20 09:20:02,004 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:03,679 epoch 6 - iter 77/773 - loss 0.09373771 - time (sec): 1.67 - samples/sec: 7129.91 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:20:05,341 epoch 6 - iter 154/773 - loss 0.10168393 - time (sec): 3.34 - samples/sec: 7107.39 - lr: 0.000027 - momentum: 0.000000
2023-10-20 09:20:07,029 epoch 6 - iter 231/773 - loss 0.11169160 - time (sec): 5.02 - samples/sec: 7123.67 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:20:08,763 epoch 6 - iter 308/773 - loss 0.11308050 - time (sec): 6.76 - samples/sec: 7205.74 - lr: 0.000026 - momentum: 0.000000
2023-10-20 09:20:10,476 epoch 6 - iter 385/773 - loss 0.11734393 - time (sec): 8.47 - samples/sec: 7119.05 - lr: 0.000025 - momentum: 0.000000
2023-10-20 09:20:12,242 epoch 6 - iter 462/773 - loss 0.11359539 - time (sec): 10.24 - samples/sec: 7161.69 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:20:13,952 epoch 6 - iter 539/773 - loss 0.11053055 - time (sec): 11.95 - samples/sec: 7179.03 - lr: 0.000024 - momentum: 0.000000
2023-10-20 09:20:15,718 epoch 6 - iter 616/773 - loss 0.10958303 - time (sec): 13.71 - samples/sec: 7213.13 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:20:17,429 epoch 6 - iter 693/773 - loss 0.10856611 - time (sec): 15.42 - samples/sec: 7165.92 - lr: 0.000023 - momentum: 0.000000
2023-10-20 09:20:19,168 epoch 6 - iter 770/773 - loss 0.10997793 - time (sec): 17.16 - samples/sec: 7211.89 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:20:19,240 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:19,241 EPOCH 6 done: loss 0.1098 - lr: 0.000022
2023-10-20 09:20:20,340 DEV : loss 0.07457771897315979 - f1-score (micro avg) 0.6144
2023-10-20 09:20:20,354 saving best model
2023-10-20 09:20:20,394 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:22,182 epoch 7 - iter 77/773 - loss 0.10104822 - time (sec): 1.79 - samples/sec: 7514.85 - lr: 0.000022 - momentum: 0.000000
2023-10-20 09:20:23,898 epoch 7 - iter 154/773 - loss 0.09821556 - time (sec): 3.50 - samples/sec: 7100.36 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:20:25,626 epoch 7 - iter 231/773 - loss 0.09548090 - time (sec): 5.23 - samples/sec: 7242.33 - lr: 0.000021 - momentum: 0.000000
2023-10-20 09:20:27,321 epoch 7 - iter 308/773 - loss 0.10355270 - time (sec): 6.93 - samples/sec: 7143.99 - lr: 0.000020 - momentum: 0.000000
2023-10-20 09:20:29,117 epoch 7 - iter 385/773 - loss 0.10438038 - time (sec): 8.72 - samples/sec: 7124.85 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:20:30,854 epoch 7 - iter 462/773 - loss 0.10313345 - time (sec): 10.46 - samples/sec: 7148.72 - lr: 0.000019 - momentum: 0.000000
2023-10-20 09:20:32,639 epoch 7 - iter 539/773 - loss 0.10557270 - time (sec): 12.24 - samples/sec: 7134.59 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:20:34,316 epoch 7 - iter 616/773 - loss 0.10463863 - time (sec): 13.92 - samples/sec: 7183.24 - lr: 0.000018 - momentum: 0.000000
2023-10-20 09:20:36,084 epoch 7 - iter 693/773 - loss 0.10402035 - time (sec): 15.69 - samples/sec: 7119.43 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:20:37,828 epoch 7 - iter 770/773 - loss 0.10392693 - time (sec): 17.43 - samples/sec: 7102.19 - lr: 0.000017 - momentum: 0.000000
2023-10-20 09:20:37,899 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:37,899 EPOCH 7 done: loss 0.1037 - lr: 0.000017
2023-10-20 09:20:39,024 DEV : loss 0.07657597213983536 - f1-score (micro avg) 0.6242
2023-10-20 09:20:39,037 saving best model
2023-10-20 09:20:39,075 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:40,803 epoch 8 - iter 77/773 - loss 0.08347295 - time (sec): 1.73 - samples/sec: 7045.81 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:20:42,539 epoch 8 - iter 154/773 - loss 0.10176786 - time (sec): 3.46 - samples/sec: 7193.34 - lr: 0.000016 - momentum: 0.000000
2023-10-20 09:20:44,261 epoch 8 - iter 231/773 - loss 0.10219115 - time (sec): 5.19 - samples/sec: 7117.05 - lr: 0.000015 - momentum: 0.000000
2023-10-20 09:20:45,993 epoch 8 - iter 308/773 - loss 0.09661994 - time (sec): 6.92 - samples/sec: 7120.96 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:20:47,714 epoch 8 - iter 385/773 - loss 0.09443349 - time (sec): 8.64 - samples/sec: 7214.99 - lr: 0.000014 - momentum: 0.000000
2023-10-20 09:20:49,480 epoch 8 - iter 462/773 - loss 0.09714750 - time (sec): 10.40 - samples/sec: 7273.85 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:20:51,177 epoch 8 - iter 539/773 - loss 0.09639649 - time (sec): 12.10 - samples/sec: 7201.04 - lr: 0.000013 - momentum: 0.000000
2023-10-20 09:20:52,899 epoch 8 - iter 616/773 - loss 0.09879490 - time (sec): 13.82 - samples/sec: 7136.22 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:20:54,622 epoch 8 - iter 693/773 - loss 0.09745774 - time (sec): 15.55 - samples/sec: 7128.81 - lr: 0.000012 - momentum: 0.000000
2023-10-20 09:20:56,382 epoch 8 - iter 770/773 - loss 0.09749713 - time (sec): 17.31 - samples/sec: 7163.01 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:20:56,439 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:56,439 EPOCH 8 done: loss 0.0973 - lr: 0.000011
2023-10-20 09:20:57,540 DEV : loss 0.07944915443658829 - f1-score (micro avg) 0.6269
2023-10-20 09:20:57,552 saving best model
2023-10-20 09:20:57,590 ----------------------------------------------------------------------------------------------------
2023-10-20 09:20:59,260 epoch 9 - iter 77/773 - loss 0.09459646 - time (sec): 1.67 - samples/sec: 7330.85 - lr: 0.000011 - momentum: 0.000000
2023-10-20 09:21:00,859 epoch 9 - iter 154/773 - loss 0.09513909 - time (sec): 3.27 - samples/sec: 7555.16 - lr: 0.000010 - momentum: 0.000000
2023-10-20 09:21:02,459 epoch 9 - iter 231/773 - loss 0.08891959 - time (sec): 4.87 - samples/sec: 7837.97 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:21:04,135 epoch 9 - iter 308/773 - loss 0.09023339 - time (sec): 6.54 - samples/sec: 7649.30 - lr: 0.000009 - momentum: 0.000000
2023-10-20 09:21:05,867 epoch 9 - iter 385/773 - loss 0.09340261 - time (sec): 8.28 - samples/sec: 7652.07 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:21:07,610 epoch 9 - iter 462/773 - loss 0.09643093 - time (sec): 10.02 - samples/sec: 7510.55 - lr: 0.000008 - momentum: 0.000000
2023-10-20 09:21:09,362 epoch 9 - iter 539/773 - loss 0.09664935 - time (sec): 11.77 - samples/sec: 7470.84 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:21:11,036 epoch 9 - iter 616/773 - loss 0.09738019 - time (sec): 13.45 - samples/sec: 7447.59 - lr: 0.000007 - momentum: 0.000000
2023-10-20 09:21:12,730 epoch 9 - iter 693/773 - loss 0.09569268 - time (sec): 15.14 - samples/sec: 7368.56 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:21:14,449 epoch 9 - iter 770/773 - loss 0.09441439 - time (sec): 16.86 - samples/sec: 7348.71 - lr: 0.000006 - momentum: 0.000000
2023-10-20 09:21:14,509 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:14,509 EPOCH 9 done: loss 0.0943 - lr: 0.000006
2023-10-20 09:21:15,576 DEV : loss 0.08020081371068954 - f1-score (micro avg) 0.6476
2023-10-20 09:21:15,587 saving best model
2023-10-20 09:21:15,626 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:17,352 epoch 10 - iter 77/773 - loss 0.10683426 - time (sec): 1.73 - samples/sec: 6965.15 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:21:19,118 epoch 10 - iter 154/773 - loss 0.09922985 - time (sec): 3.49 - samples/sec: 7124.64 - lr: 0.000005 - momentum: 0.000000
2023-10-20 09:21:20,969 epoch 10 - iter 231/773 - loss 0.09606770 - time (sec): 5.34 - samples/sec: 7135.38 - lr: 0.000004 - momentum: 0.000000
2023-10-20 09:21:22,727 epoch 10 - iter 308/773 - loss 0.09168936 - time (sec): 7.10 - samples/sec: 7163.36 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:21:24,454 epoch 10 - iter 385/773 - loss 0.09319490 - time (sec): 8.83 - samples/sec: 7145.89 - lr: 0.000003 - momentum: 0.000000
2023-10-20 09:21:26,193 epoch 10 - iter 462/773 - loss 0.09108964 - time (sec): 10.57 - samples/sec: 7122.69 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:21:27,920 epoch 10 - iter 539/773 - loss 0.08863410 - time (sec): 12.29 - samples/sec: 7160.35 - lr: 0.000002 - momentum: 0.000000
2023-10-20 09:21:29,676 epoch 10 - iter 616/773 - loss 0.08839821 - time (sec): 14.05 - samples/sec: 7084.18 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:21:31,407 epoch 10 - iter 693/773 - loss 0.09194621 - time (sec): 15.78 - samples/sec: 7085.52 - lr: 0.000001 - momentum: 0.000000
2023-10-20 09:21:33,142 epoch 10 - iter 770/773 - loss 0.09345324 - time (sec): 17.52 - samples/sec: 7078.58 - lr: 0.000000 - momentum: 0.000000
2023-10-20 09:21:33,207 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:33,207 EPOCH 10 done: loss 0.0933 - lr: 0.000000
2023-10-20 09:21:34,289 DEV : loss 0.08138395845890045 - f1-score (micro avg) 0.6491
2023-10-20 09:21:34,302 saving best model
2023-10-20 09:21:34,370 ----------------------------------------------------------------------------------------------------
2023-10-20 09:21:34,371 Loading model from best epoch ...
2023-10-20 09:21:34,443 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-20 09:21:37,354
Results:
- F-score (micro) 0.5866
- F-score (macro) 0.3173
- Accuracy 0.4314
By class:
              precision    recall  f1-score   support

         LOC     0.6274    0.6818    0.6535       946
    BUILDING     0.3400    0.0919    0.1447       185
      STREET     0.5556    0.0893    0.1538        56

   micro avg     0.6136    0.5619    0.5866      1187
   macro avg     0.5077    0.2877    0.3173      1187
weighted avg     0.5792    0.5619    0.5506      1187
2023-10-20 09:21:37,354 ----------------------------------------------------------------------------------------------------