2023-10-17 00:10:46,564 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,565 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 00:10:46,565 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,565 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 00:10:46,565 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,565 Train: 6183 sentences
2023-10-17 00:10:46,565 (train_with_dev=False, train_with_test=False)
2023-10-17 00:10:46,565 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,565 Training Params:
2023-10-17 00:10:46,565 - learning_rate: "5e-05"
2023-10-17 00:10:46,565 - mini_batch_size: "8"
2023-10-17 00:10:46,565 - max_epochs: "10"
2023-10-17 00:10:46,565 - shuffle: "True"
2023-10-17 00:10:46,566 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,566 Plugins:
2023-10-17 00:10:46,566 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 00:10:46,566 ----------------------------------------------------------------------------------------------------
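Note on the `lr` column: the values logged during training are consistent with a linear warmup over the first 10% of all steps followed by a linear decay to zero. The sketch below is an illustration under that assumption, not Flair's actual `LinearScheduler` code; the function name `linear_schedule_lr` and its parameters are hypothetical, and the library may count steps slightly differently.

```python
import math

# Values taken from the log: 6183 training sentences, mini-batch size 8,
# 10 epochs, peak learning rate 5e-05, warmup fraction 0.1.
ITERS_PER_EPOCH = math.ceil(6183 / 8)   # 773, matching "iter .../773"
TOTAL_STEPS = ITERS_PER_EPOCH * 10      # 7730


def linear_schedule_lr(step, total_steps=TOTAL_STEPS, peak_lr=5e-5,
                       warmup_fraction=0.1):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows proportionally with the step count.
        return peak_lr * step / warmup_steps
    # Decay phase: lr shrinks linearly until it reaches 0 at total_steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Global step = (epoch - 1) * ITERS_PER_EPOCH + iteration within the epoch.
    for epoch, it in [(1, 77), (1, 770), (2, 770), (10, 770)]:
        step = (epoch - 1) * ITERS_PER_EPOCH + it
        print(f"epoch {epoch} iter {it}: lr {linear_schedule_lr(step):.6f}")
```

Under these assumptions the warmup ends at the close of epoch 1, which is why the logged rate peaks at 0.000050 there and falls steadily afterwards.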
2023-10-17 00:10:46,566 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 00:10:46,566 - metric: "('micro avg', 'f1-score')"
2023-10-17 00:10:46,566 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,566 Computation:
2023-10-17 00:10:46,566 - compute on device: cuda:0
2023-10-17 00:10:46,566 - embedding storage: none
2023-10-17 00:10:46,566 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,566 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 00:10:46,566 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:46,566 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:51,045 epoch 1 - iter 77/773 - loss 2.01913050 - time (sec): 4.48 - samples/sec: 2875.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 00:10:55,420 epoch 1 - iter 154/773 - loss 1.15152095 - time (sec): 8.85 - samples/sec: 2881.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 00:10:59,884 epoch 1 - iter 231/773 - loss 0.83359397 - time (sec): 13.32 - samples/sec: 2834.07 - lr: 0.000015 - momentum: 0.000000
2023-10-17 00:11:04,318 epoch 1 - iter 308/773 - loss 0.66484530 - time (sec): 17.75 - samples/sec: 2812.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 00:11:08,928 epoch 1 - iter 385/773 - loss 0.55488029 - time (sec): 22.36 - samples/sec: 2796.56 - lr: 0.000025 - momentum: 0.000000
2023-10-17 00:11:13,398 epoch 1 - iter 462/773 - loss 0.48513473 - time (sec): 26.83 - samples/sec: 2767.76 - lr: 0.000030 - momentum: 0.000000
2023-10-17 00:11:17,743 epoch 1 - iter 539/773 - loss 0.43220579 - time (sec): 31.18 - samples/sec: 2767.71 - lr: 0.000035 - momentum: 0.000000
2023-10-17 00:11:22,171 epoch 1 - iter 616/773 - loss 0.39217814 - time (sec): 35.60 - samples/sec: 2766.34 - lr: 0.000040 - momentum: 0.000000
2023-10-17 00:11:26,934 epoch 1 - iter 693/773 - loss 0.35818872 - time (sec): 40.37 - samples/sec: 2755.31 - lr: 0.000045 - momentum: 0.000000
2023-10-17 00:11:31,429 epoch 1 - iter 770/773 - loss 0.33013406 - time (sec): 44.86 - samples/sec: 2762.25 - lr: 0.000050 - momentum: 0.000000
2023-10-17 00:11:31,575 ----------------------------------------------------------------------------------------------------
2023-10-17 00:11:31,575 EPOCH 1 done: loss 0.3293 - lr: 0.000050
2023-10-17 00:11:33,573 DEV : loss 0.059652842581272125 - f1-score (micro avg) 0.7208
2023-10-17 00:11:33,596 saving best model
2023-10-17 00:11:33,936 ----------------------------------------------------------------------------------------------------
2023-10-17 00:11:38,685 epoch 2 - iter 77/773 - loss 0.08267678 - time (sec): 4.75 - samples/sec: 2784.77 - lr: 0.000049 - momentum: 0.000000
2023-10-17 00:11:43,311 epoch 2 - iter 154/773 - loss 0.08055208 - time (sec): 9.37 - samples/sec: 2754.52 - lr: 0.000049 - momentum: 0.000000
2023-10-17 00:11:47,825 epoch 2 - iter 231/773 - loss 0.08052395 - time (sec): 13.89 - samples/sec: 2743.62 - lr: 0.000048 - momentum: 0.000000
2023-10-17 00:11:52,184 epoch 2 - iter 308/773 - loss 0.08357330 - time (sec): 18.25 - samples/sec: 2741.61 - lr: 0.000048 - momentum: 0.000000
2023-10-17 00:11:56,640 epoch 2 - iter 385/773 - loss 0.08382211 - time (sec): 22.70 - samples/sec: 2709.61 - lr: 0.000047 - momentum: 0.000000
2023-10-17 00:12:01,104 epoch 2 - iter 462/773 - loss 0.08414725 - time (sec): 27.17 - samples/sec: 2725.32 - lr: 0.000047 - momentum: 0.000000
2023-10-17 00:12:05,611 epoch 2 - iter 539/773 - loss 0.08425195 - time (sec): 31.67 - samples/sec: 2738.14 - lr: 0.000046 - momentum: 0.000000
2023-10-17 00:12:10,421 epoch 2 - iter 616/773 - loss 0.08124072 - time (sec): 36.48 - samples/sec: 2732.40 - lr: 0.000046 - momentum: 0.000000
2023-10-17 00:12:14,839 epoch 2 - iter 693/773 - loss 0.08111029 - time (sec): 40.90 - samples/sec: 2724.98 - lr: 0.000045 - momentum: 0.000000
2023-10-17 00:12:19,329 epoch 2 - iter 770/773 - loss 0.08051640 - time (sec): 45.39 - samples/sec: 2731.34 - lr: 0.000044 - momentum: 0.000000
2023-10-17 00:12:19,473 ----------------------------------------------------------------------------------------------------
2023-10-17 00:12:19,473 EPOCH 2 done: loss 0.0805 - lr: 0.000044
2023-10-17 00:12:21,618 DEV : loss 0.06363236904144287 - f1-score (micro avg) 0.7695
2023-10-17 00:12:21,632 saving best model
2023-10-17 00:12:22,081 ----------------------------------------------------------------------------------------------------
2023-10-17 00:12:26,590 epoch 3 - iter 77/773 - loss 0.04146028 - time (sec): 4.51 - samples/sec: 2835.22 - lr: 0.000044 - momentum: 0.000000
2023-10-17 00:12:31,330 epoch 3 - iter 154/773 - loss 0.05242848 - time (sec): 9.25 - samples/sec: 2794.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 00:12:36,075 epoch 3 - iter 231/773 - loss 0.05020704 - time (sec): 13.99 - samples/sec: 2799.71 - lr: 0.000043 - momentum: 0.000000
2023-10-17 00:12:40,485 epoch 3 - iter 308/773 - loss 0.04913742 - time (sec): 18.40 - samples/sec: 2761.24 - lr: 0.000042 - momentum: 0.000000
2023-10-17 00:12:45,283 epoch 3 - iter 385/773 - loss 0.04825767 - time (sec): 23.20 - samples/sec: 2709.09 - lr: 0.000042 - momentum: 0.000000
2023-10-17 00:12:49,681 epoch 3 - iter 462/773 - loss 0.04909420 - time (sec): 27.60 - samples/sec: 2706.36 - lr: 0.000041 - momentum: 0.000000
2023-10-17 00:12:54,373 epoch 3 - iter 539/773 - loss 0.05056346 - time (sec): 32.29 - samples/sec: 2716.55 - lr: 0.000041 - momentum: 0.000000
2023-10-17 00:12:58,913 epoch 3 - iter 616/773 - loss 0.04920383 - time (sec): 36.83 - samples/sec: 2711.79 - lr: 0.000040 - momentum: 0.000000
2023-10-17 00:13:03,233 epoch 3 - iter 693/773 - loss 0.04950107 - time (sec): 41.15 - samples/sec: 2708.02 - lr: 0.000039 - momentum: 0.000000
2023-10-17 00:13:07,711 epoch 3 - iter 770/773 - loss 0.04922958 - time (sec): 45.63 - samples/sec: 2714.93 - lr: 0.000039 - momentum: 0.000000
2023-10-17 00:13:07,859 ----------------------------------------------------------------------------------------------------
2023-10-17 00:13:07,859 EPOCH 3 done: loss 0.0491 - lr: 0.000039
2023-10-17 00:13:09,920 DEV : loss 0.07400500774383545 - f1-score (micro avg) 0.7557
2023-10-17 00:13:09,933 ----------------------------------------------------------------------------------------------------
2023-10-17 00:13:14,277 epoch 4 - iter 77/773 - loss 0.03358044 - time (sec): 4.34 - samples/sec: 2682.63 - lr: 0.000038 - momentum: 0.000000
2023-10-17 00:13:18,839 epoch 4 - iter 154/773 - loss 0.03101342 - time (sec): 8.91 - samples/sec: 2657.51 - lr: 0.000038 - momentum: 0.000000
2023-10-17 00:13:23,406 epoch 4 - iter 231/773 - loss 0.04019504 - time (sec): 13.47 - samples/sec: 2696.83 - lr: 0.000037 - momentum: 0.000000
2023-10-17 00:13:27,935 epoch 4 - iter 308/773 - loss 0.03771869 - time (sec): 18.00 - samples/sec: 2698.58 - lr: 0.000037 - momentum: 0.000000
2023-10-17 00:13:32,515 epoch 4 - iter 385/773 - loss 0.03809639 - time (sec): 22.58 - samples/sec: 2693.07 - lr: 0.000036 - momentum: 0.000000
2023-10-17 00:13:36,925 epoch 4 - iter 462/773 - loss 0.03778033 - time (sec): 26.99 - samples/sec: 2697.29 - lr: 0.000036 - momentum: 0.000000
2023-10-17 00:13:41,472 epoch 4 - iter 539/773 - loss 0.03704712 - time (sec): 31.54 - samples/sec: 2709.38 - lr: 0.000035 - momentum: 0.000000
2023-10-17 00:13:45,999 epoch 4 - iter 616/773 - loss 0.03811842 - time (sec): 36.07 - samples/sec: 2708.41 - lr: 0.000034 - momentum: 0.000000
2023-10-17 00:13:50,448 epoch 4 - iter 693/773 - loss 0.03785183 - time (sec): 40.51 - samples/sec: 2725.89 - lr: 0.000034 - momentum: 0.000000
2023-10-17 00:13:55,203 epoch 4 - iter 770/773 - loss 0.03661737 - time (sec): 45.27 - samples/sec: 2732.01 - lr: 0.000033 - momentum: 0.000000
2023-10-17 00:13:55,384 ----------------------------------------------------------------------------------------------------
2023-10-17 00:13:55,384 EPOCH 4 done: loss 0.0369 - lr: 0.000033
2023-10-17 00:13:57,425 DEV : loss 0.094989113509655 - f1-score (micro avg) 0.7735
2023-10-17 00:13:57,438 saving best model
2023-10-17 00:13:57,905 ----------------------------------------------------------------------------------------------------
2023-10-17 00:14:02,360 epoch 5 - iter 77/773 - loss 0.03472163 - time (sec): 4.45 - samples/sec: 2784.21 - lr: 0.000033 - momentum: 0.000000
2023-10-17 00:14:06,806 epoch 5 - iter 154/773 - loss 0.03021440 - time (sec): 8.89 - samples/sec: 2742.09 - lr: 0.000032 - momentum: 0.000000
2023-10-17 00:14:11,196 epoch 5 - iter 231/773 - loss 0.02772660 - time (sec): 13.28 - samples/sec: 2772.06 - lr: 0.000032 - momentum: 0.000000
2023-10-17 00:14:15,511 epoch 5 - iter 308/773 - loss 0.02631770 - time (sec): 17.60 - samples/sec: 2794.08 - lr: 0.000031 - momentum: 0.000000
2023-10-17 00:14:20,117 epoch 5 - iter 385/773 - loss 0.02565861 - time (sec): 22.20 - samples/sec: 2794.33 - lr: 0.000031 - momentum: 0.000000
2023-10-17 00:14:24,603 epoch 5 - iter 462/773 - loss 0.02740935 - time (sec): 26.69 - samples/sec: 2773.42 - lr: 0.000030 - momentum: 0.000000
2023-10-17 00:14:29,099 epoch 5 - iter 539/773 - loss 0.02750473 - time (sec): 31.19 - samples/sec: 2807.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 00:14:33,457 epoch 5 - iter 616/773 - loss 0.02705987 - time (sec): 35.54 - samples/sec: 2797.03 - lr: 0.000029 - momentum: 0.000000
2023-10-17 00:14:37,868 epoch 5 - iter 693/773 - loss 0.02728248 - time (sec): 39.96 - samples/sec: 2796.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 00:14:42,389 epoch 5 - iter 770/773 - loss 0.02673116 - time (sec): 44.48 - samples/sec: 2787.57 - lr: 0.000028 - momentum: 0.000000
2023-10-17 00:14:42,531 ----------------------------------------------------------------------------------------------------
2023-10-17 00:14:42,532 EPOCH 5 done: loss 0.0267 - lr: 0.000028
2023-10-17 00:14:44,574 DEV : loss 0.08982112258672714 - f1-score (micro avg) 0.7782
2023-10-17 00:14:44,587 saving best model
2023-10-17 00:14:45,034 ----------------------------------------------------------------------------------------------------
2023-10-17 00:14:49,458 epoch 6 - iter 77/773 - loss 0.01860363 - time (sec): 4.42 - samples/sec: 2848.08 - lr: 0.000027 - momentum: 0.000000
2023-10-17 00:14:54,100 epoch 6 - iter 154/773 - loss 0.02010732 - time (sec): 9.06 - samples/sec: 2800.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 00:14:58,610 epoch 6 - iter 231/773 - loss 0.02089781 - time (sec): 13.57 - samples/sec: 2737.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 00:15:02,982 epoch 6 - iter 308/773 - loss 0.02145581 - time (sec): 17.95 - samples/sec: 2766.88 - lr: 0.000026 - momentum: 0.000000
2023-10-17 00:15:07,619 epoch 6 - iter 385/773 - loss 0.02124064 - time (sec): 22.58 - samples/sec: 2763.84 - lr: 0.000025 - momentum: 0.000000
2023-10-17 00:15:12,187 epoch 6 - iter 462/773 - loss 0.02044592 - time (sec): 27.15 - samples/sec: 2735.29 - lr: 0.000024 - momentum: 0.000000
2023-10-17 00:15:16,805 epoch 6 - iter 539/773 - loss 0.01854573 - time (sec): 31.77 - samples/sec: 2735.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 00:15:21,104 epoch 6 - iter 616/773 - loss 0.01837402 - time (sec): 36.07 - samples/sec: 2709.40 - lr: 0.000023 - momentum: 0.000000
2023-10-17 00:15:25,786 epoch 6 - iter 693/773 - loss 0.01805303 - time (sec): 40.75 - samples/sec: 2715.50 - lr: 0.000023 - momentum: 0.000000
2023-10-17 00:15:30,367 epoch 6 - iter 770/773 - loss 0.01812580 - time (sec): 45.33 - samples/sec: 2734.73 - lr: 0.000022 - momentum: 0.000000
2023-10-17 00:15:30,515 ----------------------------------------------------------------------------------------------------
2023-10-17 00:15:30,515 EPOCH 6 done: loss 0.0182 - lr: 0.000022
2023-10-17 00:15:32,896 DEV : loss 0.10503190755844116 - f1-score (micro avg) 0.7828
2023-10-17 00:15:32,909 saving best model
2023-10-17 00:15:33,359 ----------------------------------------------------------------------------------------------------
2023-10-17 00:15:37,736 epoch 7 - iter 77/773 - loss 0.01183332 - time (sec): 4.38 - samples/sec: 2641.89 - lr: 0.000022 - momentum: 0.000000
2023-10-17 00:15:42,135 epoch 7 - iter 154/773 - loss 0.01209373 - time (sec): 8.78 - samples/sec: 2648.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 00:15:46,560 epoch 7 - iter 231/773 - loss 0.01195452 - time (sec): 13.20 - samples/sec: 2661.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 00:15:51,297 epoch 7 - iter 308/773 - loss 0.01287340 - time (sec): 17.94 - samples/sec: 2678.24 - lr: 0.000020 - momentum: 0.000000
2023-10-17 00:15:56,031 epoch 7 - iter 385/773 - loss 0.01293459 - time (sec): 22.67 - samples/sec: 2684.40 - lr: 0.000019 - momentum: 0.000000
2023-10-17 00:16:00,860 epoch 7 - iter 462/773 - loss 0.01236694 - time (sec): 27.50 - samples/sec: 2680.71 - lr: 0.000019 - momentum: 0.000000
2023-10-17 00:16:05,350 epoch 7 - iter 539/773 - loss 0.01324423 - time (sec): 31.99 - samples/sec: 2716.70 - lr: 0.000018 - momentum: 0.000000
2023-10-17 00:16:09,729 epoch 7 - iter 616/773 - loss 0.01303248 - time (sec): 36.37 - samples/sec: 2732.97 - lr: 0.000018 - momentum: 0.000000
2023-10-17 00:16:14,130 epoch 7 - iter 693/773 - loss 0.01280785 - time (sec): 40.77 - samples/sec: 2728.98 - lr: 0.000017 - momentum: 0.000000
2023-10-17 00:16:18,632 epoch 7 - iter 770/773 - loss 0.01284700 - time (sec): 45.27 - samples/sec: 2736.07 - lr: 0.000017 - momentum: 0.000000
2023-10-17 00:16:18,807 ----------------------------------------------------------------------------------------------------
2023-10-17 00:16:18,807 EPOCH 7 done: loss 0.0129 - lr: 0.000017
2023-10-17 00:16:20,898 DEV : loss 0.11371700465679169 - f1-score (micro avg) 0.7819
2023-10-17 00:16:20,911 ----------------------------------------------------------------------------------------------------
2023-10-17 00:16:25,476 epoch 8 - iter 77/773 - loss 0.01156935 - time (sec): 4.56 - samples/sec: 2709.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 00:16:30,132 epoch 8 - iter 154/773 - loss 0.01249568 - time (sec): 9.22 - samples/sec: 2800.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 00:16:34,605 epoch 8 - iter 231/773 - loss 0.01229927 - time (sec): 13.69 - samples/sec: 2766.08 - lr: 0.000015 - momentum: 0.000000
2023-10-17 00:16:39,121 epoch 8 - iter 308/773 - loss 0.01066091 - time (sec): 18.21 - samples/sec: 2785.15 - lr: 0.000014 - momentum: 0.000000
2023-10-17 00:16:43,831 epoch 8 - iter 385/773 - loss 0.00972507 - time (sec): 22.92 - samples/sec: 2770.43 - lr: 0.000014 - momentum: 0.000000
2023-10-17 00:16:48,553 epoch 8 - iter 462/773 - loss 0.00980739 - time (sec): 27.64 - samples/sec: 2766.62 - lr: 0.000013 - momentum: 0.000000
2023-10-17 00:16:53,046 epoch 8 - iter 539/773 - loss 0.01020468 - time (sec): 32.13 - samples/sec: 2739.83 - lr: 0.000013 - momentum: 0.000000
2023-10-17 00:16:57,396 epoch 8 - iter 616/773 - loss 0.01063618 - time (sec): 36.48 - samples/sec: 2725.94 - lr: 0.000012 - momentum: 0.000000
2023-10-17 00:17:01,933 epoch 8 - iter 693/773 - loss 0.01072864 - time (sec): 41.02 - samples/sec: 2734.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 00:17:06,292 epoch 8 - iter 770/773 - loss 0.01043384 - time (sec): 45.38 - samples/sec: 2729.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 00:17:06,450 ----------------------------------------------------------------------------------------------------
2023-10-17 00:17:06,450 EPOCH 8 done: loss 0.0104 - lr: 0.000011
2023-10-17 00:17:08,525 DEV : loss 0.11392305791378021 - f1-score (micro avg) 0.7769
2023-10-17 00:17:08,537 ----------------------------------------------------------------------------------------------------
2023-10-17 00:17:12,969 epoch 9 - iter 77/773 - loss 0.00904434 - time (sec): 4.43 - samples/sec: 2761.63 - lr: 0.000011 - momentum: 0.000000
2023-10-17 00:17:17,472 epoch 9 - iter 154/773 - loss 0.00780158 - time (sec): 8.93 - samples/sec: 2828.74 - lr: 0.000010 - momentum: 0.000000
2023-10-17 00:17:22,301 epoch 9 - iter 231/773 - loss 0.00701070 - time (sec): 13.76 - samples/sec: 2789.60 - lr: 0.000009 - momentum: 0.000000
2023-10-17 00:17:26,710 epoch 9 - iter 308/773 - loss 0.00763945 - time (sec): 18.17 - samples/sec: 2774.88 - lr: 0.000009 - momentum: 0.000000
2023-10-17 00:17:31,242 epoch 9 - iter 385/773 - loss 0.00690755 - time (sec): 22.70 - samples/sec: 2761.91 - lr: 0.000008 - momentum: 0.000000
2023-10-17 00:17:35,850 epoch 9 - iter 462/773 - loss 0.00645291 - time (sec): 27.31 - samples/sec: 2743.64 - lr: 0.000008 - momentum: 0.000000
2023-10-17 00:17:40,446 epoch 9 - iter 539/773 - loss 0.00573275 - time (sec): 31.91 - samples/sec: 2708.02 - lr: 0.000007 - momentum: 0.000000
2023-10-17 00:17:44,926 epoch 9 - iter 616/773 - loss 0.00611407 - time (sec): 36.39 - samples/sec: 2725.77 - lr: 0.000007 - momentum: 0.000000
2023-10-17 00:17:49,539 epoch 9 - iter 693/773 - loss 0.00629245 - time (sec): 41.00 - samples/sec: 2723.65 - lr: 0.000006 - momentum: 0.000000
2023-10-17 00:17:53,916 epoch 9 - iter 770/773 - loss 0.00650414 - time (sec): 45.38 - samples/sec: 2726.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 00:17:54,091 ----------------------------------------------------------------------------------------------------
2023-10-17 00:17:54,091 EPOCH 9 done: loss 0.0065 - lr: 0.000006
2023-10-17 00:17:56,158 DEV : loss 0.12053602188825607 - f1-score (micro avg) 0.7826
2023-10-17 00:17:56,171 ----------------------------------------------------------------------------------------------------
2023-10-17 00:18:00,927 epoch 10 - iter 77/773 - loss 0.00599831 - time (sec): 4.75 - samples/sec: 2645.45 - lr: 0.000005 - momentum: 0.000000
2023-10-17 00:18:05,488 epoch 10 - iter 154/773 - loss 0.00506362 - time (sec): 9.32 - samples/sec: 2681.78 - lr: 0.000005 - momentum: 0.000000
2023-10-17 00:18:09,735 epoch 10 - iter 231/773 - loss 0.00564116 - time (sec): 13.56 - samples/sec: 2681.17 - lr: 0.000004 - momentum: 0.000000
2023-10-17 00:18:14,369 epoch 10 - iter 308/773 - loss 0.00511353 - time (sec): 18.20 - samples/sec: 2704.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 00:18:18,980 epoch 10 - iter 385/773 - loss 0.00477008 - time (sec): 22.81 - samples/sec: 2745.02 - lr: 0.000003 - momentum: 0.000000
2023-10-17 00:18:23,547 epoch 10 - iter 462/773 - loss 0.00429130 - time (sec): 27.38 - samples/sec: 2766.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 00:18:28,025 epoch 10 - iter 539/773 - loss 0.00437021 - time (sec): 31.85 - samples/sec: 2763.06 - lr: 0.000002 - momentum: 0.000000
2023-10-17 00:18:32,366 epoch 10 - iter 616/773 - loss 0.00440371 - time (sec): 36.19 - samples/sec: 2751.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 00:18:36,768 epoch 10 - iter 693/773 - loss 0.00478785 - time (sec): 40.60 - samples/sec: 2756.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 00:18:41,261 epoch 10 - iter 770/773 - loss 0.00462015 - time (sec): 45.09 - samples/sec: 2746.82 - lr: 0.000000 - momentum: 0.000000
2023-10-17 00:18:41,417 ----------------------------------------------------------------------------------------------------
2023-10-17 00:18:41,417 EPOCH 10 done: loss 0.0046 - lr: 0.000000
2023-10-17 00:18:43,468 DEV : loss 0.11875911056995392 - f1-score (micro avg) 0.7789
2023-10-17 00:18:43,819 ----------------------------------------------------------------------------------------------------
2023-10-17 00:18:43,821 Loading model from best epoch ...
2023-10-17 00:18:45,650 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 00:18:51,351
Results:
- F-score (micro) 0.8017
- F-score (macro) 0.7056
- Accuracy 0.6963
By class:
              precision    recall  f1-score   support

         LOC     0.8471    0.8436    0.8453       946
    BUILDING     0.6049    0.6703    0.6359       185
      STREET     0.6667    0.6071    0.6355        56

   micro avg     0.7980    0.8054    0.8017      1187
   macro avg     0.7062    0.7070    0.7056      1187
weighted avg     0.8009    0.8054    0.8028      1187
2023-10-17 00:18:51,351 ----------------------------------------------------------------------------------------------------
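Note on the averaged scores: the micro, macro, and weighted F1 values in the final report can be cross-checked from the per-class rows. The sketch below is only a sanity check on the rounded numbers printed above, not the evaluation code Flair runs; the variable and function names are illustrative.

```python
# Per-class test results copied from the report:
# (precision, recall, f1-score, support)
per_class = {
    "LOC":      (0.8471, 0.8436, 0.8453, 946),
    "BUILDING": (0.6049, 0.6703, 0.6359, 185),
    "STREET":   (0.6667, 0.6071, 0.6355, 56),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(n for _, _, _, n in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by the class support.
weighted_f1 = sum(f * n for _, _, f, n in per_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the report.
micro_f1 = f1(0.7980, 0.8054)

if __name__ == "__main__":
    print(f"support  {total_support}")
    print(f"micro    {micro_f1:.4f}")
    print(f"macro    {macro_f1:.4f}")
    print(f"weighted {weighted_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values agree with the report only to that precision. The gap between micro (0.8017) and macro (0.7056) F1 reflects the class imbalance: LOC accounts for 946 of the 1187 test entities and scores much higher than BUILDING and STREET.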