2023-10-16 19:55:09,277 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Train: 1085 sentences
2023-10-16 19:55:09,278 (train_with_dev=False, train_with_test=False)
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Training Params:
2023-10-16 19:55:09,278 - learning_rate: "3e-05"
2023-10-16 19:55:09,278 - mini_batch_size: "4"
2023-10-16 19:55:09,278 - max_epochs: "10"
2023-10-16 19:55:09,278 - shuffle: "True"
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Plugins:
2023-10-16 19:55:09,278 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
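The learning rates logged below (linear warmup across epoch 1 up to 3e-05, then linear decay to zero) can be reproduced with a small sketch of the LinearScheduler behaviour. This is an assumption-labeled approximation, not Flair's actual implementation: total steps (2720 = 272 iterations x 10 epochs) and the exact step accounting are inferred from the log.

```python
def linear_schedule_lr(step, peak_lr=3e-5, total_steps=2720, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Sketch of the LinearScheduler plugin logged above; step counts are
    inferred from this run (272 iterations per epoch, 10 epochs).
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 272 = one epoch here
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values when rounded to six decimals, e.g.:
print(round(linear_schedule_lr(27), 6))    # epoch 1, iter 27  -> 3e-06
print(round(linear_schedule_lr(326), 6))   # epoch 2, iter 54  -> 2.9e-05
print(round(linear_schedule_lr(2718), 6))  # epoch 10, iter 270 -> 0.0
```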
2023-10-16 19:55:09,278 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:55:09,278 - metric: "('micro avg', 'f1-score')"
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 Computation:
2023-10-16 19:55:09,279 - compute on device: cuda:0
2023-10-16 19:55:09,279 - embedding storage: none
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:10,798 epoch 1 - iter 27/272 - loss 2.83234957 - time (sec): 1.52 - samples/sec: 3290.84 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:55:12,264 epoch 1 - iter 54/272 - loss 2.43687139 - time (sec): 2.98 - samples/sec: 3407.94 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:55:13,757 epoch 1 - iter 81/272 - loss 1.87521019 - time (sec): 4.48 - samples/sec: 3379.15 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:55:15,248 epoch 1 - iter 108/272 - loss 1.54195388 - time (sec): 5.97 - samples/sec: 3360.98 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:55:16,670 epoch 1 - iter 135/272 - loss 1.34849777 - time (sec): 7.39 - samples/sec: 3310.21 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:55:18,237 epoch 1 - iter 162/272 - loss 1.16896217 - time (sec): 8.96 - samples/sec: 3315.75 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:55:19,779 epoch 1 - iter 189/272 - loss 1.04849700 - time (sec): 10.50 - samples/sec: 3299.64 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:55:21,259 epoch 1 - iter 216/272 - loss 0.95395801 - time (sec): 11.98 - samples/sec: 3306.92 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:22,843 epoch 1 - iter 243/272 - loss 0.87116032 - time (sec): 13.56 - samples/sec: 3302.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:24,711 epoch 1 - iter 270/272 - loss 0.77807401 - time (sec): 15.43 - samples/sec: 3356.32 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:55:24,812 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:24,812 EPOCH 1 done: loss 0.7756 - lr: 0.000030
2023-10-16 19:55:25,806 DEV : loss 0.16201870143413544 - f1-score (micro avg) 0.5989
2023-10-16 19:55:25,810 saving best model
2023-10-16 19:55:26,150 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:27,855 epoch 2 - iter 27/272 - loss 0.16691729 - time (sec): 1.70 - samples/sec: 3558.22 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:55:29,354 epoch 2 - iter 54/272 - loss 0.18391069 - time (sec): 3.20 - samples/sec: 3606.93 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:30,864 epoch 2 - iter 81/272 - loss 0.18206767 - time (sec): 4.71 - samples/sec: 3430.44 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:32,454 epoch 2 - iter 108/272 - loss 0.17658993 - time (sec): 6.30 - samples/sec: 3436.87 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:33,900 epoch 2 - iter 135/272 - loss 0.16868206 - time (sec): 7.75 - samples/sec: 3408.36 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:35,478 epoch 2 - iter 162/272 - loss 0.16939540 - time (sec): 9.33 - samples/sec: 3362.54 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:37,031 epoch 2 - iter 189/272 - loss 0.16409239 - time (sec): 10.88 - samples/sec: 3355.17 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:38,595 epoch 2 - iter 216/272 - loss 0.16010105 - time (sec): 12.44 - samples/sec: 3376.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:39,951 epoch 2 - iter 243/272 - loss 0.15825024 - time (sec): 13.80 - samples/sec: 3338.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:41,608 epoch 2 - iter 270/272 - loss 0.15185793 - time (sec): 15.46 - samples/sec: 3353.38 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:41,692 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:41,692 EPOCH 2 done: loss 0.1516 - lr: 0.000027
2023-10-16 19:55:43,108 DEV : loss 0.11580488085746765 - f1-score (micro avg) 0.784
2023-10-16 19:55:43,112 saving best model
2023-10-16 19:55:43,564 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:45,007 epoch 3 - iter 27/272 - loss 0.08808373 - time (sec): 1.44 - samples/sec: 3170.05 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:46,648 epoch 3 - iter 54/272 - loss 0.09481473 - time (sec): 3.08 - samples/sec: 3413.77 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:48,227 epoch 3 - iter 81/272 - loss 0.10163387 - time (sec): 4.66 - samples/sec: 3434.61 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:49,795 epoch 3 - iter 108/272 - loss 0.09866374 - time (sec): 6.23 - samples/sec: 3487.16 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:51,436 epoch 3 - iter 135/272 - loss 0.09301795 - time (sec): 7.87 - samples/sec: 3426.99 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:53,033 epoch 3 - iter 162/272 - loss 0.09724660 - time (sec): 9.47 - samples/sec: 3404.26 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:54,679 epoch 3 - iter 189/272 - loss 0.09184127 - time (sec): 11.11 - samples/sec: 3375.12 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:56,293 epoch 3 - iter 216/272 - loss 0.09024370 - time (sec): 12.72 - samples/sec: 3366.13 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:57,711 epoch 3 - iter 243/272 - loss 0.08931322 - time (sec): 14.14 - samples/sec: 3314.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:59,194 epoch 3 - iter 270/272 - loss 0.08847740 - time (sec): 15.63 - samples/sec: 3318.50 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:55:59,280 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:59,280 EPOCH 3 done: loss 0.0888 - lr: 0.000023
2023-10-16 19:56:00,703 DEV : loss 0.10064025223255157 - f1-score (micro avg) 0.7821
2023-10-16 19:56:00,707 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:02,368 epoch 4 - iter 27/272 - loss 0.03703618 - time (sec): 1.66 - samples/sec: 3340.49 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:56:04,136 epoch 4 - iter 54/272 - loss 0.04293959 - time (sec): 3.43 - samples/sec: 3405.06 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:56:05,592 epoch 4 - iter 81/272 - loss 0.04695382 - time (sec): 4.88 - samples/sec: 3376.98 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:07,075 epoch 4 - iter 108/272 - loss 0.05388166 - time (sec): 6.37 - samples/sec: 3376.09 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:08,545 epoch 4 - iter 135/272 - loss 0.05558070 - time (sec): 7.84 - samples/sec: 3384.14 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:10,279 epoch 4 - iter 162/272 - loss 0.05373334 - time (sec): 9.57 - samples/sec: 3268.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:11,860 epoch 4 - iter 189/272 - loss 0.05246304 - time (sec): 11.15 - samples/sec: 3309.54 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:13,335 epoch 4 - iter 216/272 - loss 0.05390096 - time (sec): 12.63 - samples/sec: 3265.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:14,867 epoch 4 - iter 243/272 - loss 0.05448464 - time (sec): 14.16 - samples/sec: 3269.11 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:16,429 epoch 4 - iter 270/272 - loss 0.05501421 - time (sec): 15.72 - samples/sec: 3274.45 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:16,551 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:16,551 EPOCH 4 done: loss 0.0547 - lr: 0.000020
2023-10-16 19:56:18,001 DEV : loss 0.11307715624570847 - f1-score (micro avg) 0.8199
2023-10-16 19:56:18,005 saving best model
2023-10-16 19:56:18,442 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:19,980 epoch 5 - iter 27/272 - loss 0.04691907 - time (sec): 1.54 - samples/sec: 3349.16 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:21,422 epoch 5 - iter 54/272 - loss 0.04225722 - time (sec): 2.98 - samples/sec: 3282.75 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:22,868 epoch 5 - iter 81/272 - loss 0.04022072 - time (sec): 4.43 - samples/sec: 3258.89 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:24,349 epoch 5 - iter 108/272 - loss 0.03973764 - time (sec): 5.91 - samples/sec: 3269.21 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:25,917 epoch 5 - iter 135/272 - loss 0.03914369 - time (sec): 7.47 - samples/sec: 3244.34 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:27,596 epoch 5 - iter 162/272 - loss 0.03750230 - time (sec): 9.15 - samples/sec: 3243.35 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:29,363 epoch 5 - iter 189/272 - loss 0.03658967 - time (sec): 10.92 - samples/sec: 3281.76 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:30,973 epoch 5 - iter 216/272 - loss 0.03680208 - time (sec): 12.53 - samples/sec: 3305.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:32,478 epoch 5 - iter 243/272 - loss 0.03570734 - time (sec): 14.03 - samples/sec: 3283.66 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:34,142 epoch 5 - iter 270/272 - loss 0.03533262 - time (sec): 15.70 - samples/sec: 3287.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:34,276 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:34,276 EPOCH 5 done: loss 0.0353 - lr: 0.000017
2023-10-16 19:56:35,722 DEV : loss 0.12297820299863815 - f1-score (micro avg) 0.8154
2023-10-16 19:56:35,727 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:37,444 epoch 6 - iter 27/272 - loss 0.01991166 - time (sec): 1.72 - samples/sec: 3322.31 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:38,878 epoch 6 - iter 54/272 - loss 0.01908545 - time (sec): 3.15 - samples/sec: 3148.40 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:40,414 epoch 6 - iter 81/272 - loss 0.02105140 - time (sec): 4.69 - samples/sec: 3163.24 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:41,821 epoch 6 - iter 108/272 - loss 0.02398416 - time (sec): 6.09 - samples/sec: 3170.19 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:43,253 epoch 6 - iter 135/272 - loss 0.02455143 - time (sec): 7.53 - samples/sec: 3197.84 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:44,862 epoch 6 - iter 162/272 - loss 0.02537861 - time (sec): 9.13 - samples/sec: 3200.04 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:46,456 epoch 6 - iter 189/272 - loss 0.02638712 - time (sec): 10.73 - samples/sec: 3231.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:48,089 epoch 6 - iter 216/272 - loss 0.02536443 - time (sec): 12.36 - samples/sec: 3289.75 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:49,795 epoch 6 - iter 243/272 - loss 0.02486984 - time (sec): 14.07 - samples/sec: 3329.80 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:51,361 epoch 6 - iter 270/272 - loss 0.02422424 - time (sec): 15.63 - samples/sec: 3310.76 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:51,454 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:51,455 EPOCH 6 done: loss 0.0248 - lr: 0.000013
2023-10-16 19:56:52,874 DEV : loss 0.12834559381008148 - f1-score (micro avg) 0.8392
2023-10-16 19:56:52,878 saving best model
2023-10-16 19:56:53,283 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:54,889 epoch 7 - iter 27/272 - loss 0.01509354 - time (sec): 1.60 - samples/sec: 3342.76 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:56,559 epoch 7 - iter 54/272 - loss 0.02142213 - time (sec): 3.27 - samples/sec: 3407.39 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:58,145 epoch 7 - iter 81/272 - loss 0.01832991 - time (sec): 4.86 - samples/sec: 3396.16 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:56:59,710 epoch 7 - iter 108/272 - loss 0.01959823 - time (sec): 6.42 - samples/sec: 3400.56 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:57:01,243 epoch 7 - iter 135/272 - loss 0.01831873 - time (sec): 7.95 - samples/sec: 3428.75 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:57:02,746 epoch 7 - iter 162/272 - loss 0.02072289 - time (sec): 9.46 - samples/sec: 3380.77 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:04,279 epoch 7 - iter 189/272 - loss 0.01977148 - time (sec): 10.99 - samples/sec: 3358.94 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:05,886 epoch 7 - iter 216/272 - loss 0.01911425 - time (sec): 12.60 - samples/sec: 3368.28 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:07,420 epoch 7 - iter 243/272 - loss 0.01880223 - time (sec): 14.13 - samples/sec: 3359.24 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:08,916 epoch 7 - iter 270/272 - loss 0.02047643 - time (sec): 15.63 - samples/sec: 3318.25 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:09,000 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:09,000 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-16 19:57:10,600 DEV : loss 0.1370716243982315 - f1-score (micro avg) 0.8439
2023-10-16 19:57:10,604 saving best model
2023-10-16 19:57:11,024 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:12,531 epoch 8 - iter 27/272 - loss 0.01570058 - time (sec): 1.51 - samples/sec: 3260.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:14,135 epoch 8 - iter 54/272 - loss 0.01138672 - time (sec): 3.11 - samples/sec: 3476.00 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:15,765 epoch 8 - iter 81/272 - loss 0.01524252 - time (sec): 4.74 - samples/sec: 3507.80 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:17,378 epoch 8 - iter 108/272 - loss 0.01459537 - time (sec): 6.35 - samples/sec: 3513.07 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:18,810 epoch 8 - iter 135/272 - loss 0.01560028 - time (sec): 7.78 - samples/sec: 3519.74 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:20,474 epoch 8 - iter 162/272 - loss 0.01486699 - time (sec): 9.45 - samples/sec: 3456.85 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:21,873 epoch 8 - iter 189/272 - loss 0.01430914 - time (sec): 10.85 - samples/sec: 3448.67 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:23,213 epoch 8 - iter 216/272 - loss 0.01524647 - time (sec): 12.19 - samples/sec: 3422.66 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:24,686 epoch 8 - iter 243/272 - loss 0.01493429 - time (sec): 13.66 - samples/sec: 3398.56 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:26,272 epoch 8 - iter 270/272 - loss 0.01498748 - time (sec): 15.25 - samples/sec: 3398.46 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:26,355 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:26,355 EPOCH 8 done: loss 0.0153 - lr: 0.000007
2023-10-16 19:57:27,779 DEV : loss 0.15467961132526398 - f1-score (micro avg) 0.8199
2023-10-16 19:57:27,783 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:29,287 epoch 9 - iter 27/272 - loss 0.02729236 - time (sec): 1.50 - samples/sec: 3297.39 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:31,004 epoch 9 - iter 54/272 - loss 0.01595842 - time (sec): 3.22 - samples/sec: 3355.44 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:32,462 epoch 9 - iter 81/272 - loss 0.01475055 - time (sec): 4.68 - samples/sec: 3375.42 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:34,011 epoch 9 - iter 108/272 - loss 0.01321516 - time (sec): 6.23 - samples/sec: 3437.62 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:35,432 epoch 9 - iter 135/272 - loss 0.01577781 - time (sec): 7.65 - samples/sec: 3366.56 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:37,001 epoch 9 - iter 162/272 - loss 0.01510896 - time (sec): 9.22 - samples/sec: 3434.11 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:38,374 epoch 9 - iter 189/272 - loss 0.01394347 - time (sec): 10.59 - samples/sec: 3375.39 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:39,957 epoch 9 - iter 216/272 - loss 0.01343107 - time (sec): 12.17 - samples/sec: 3371.56 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:41,408 epoch 9 - iter 243/272 - loss 0.01270919 - time (sec): 13.62 - samples/sec: 3347.79 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:43,155 epoch 9 - iter 270/272 - loss 0.01263245 - time (sec): 15.37 - samples/sec: 3366.15 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:43,246 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:43,246 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-16 19:57:44,689 DEV : loss 0.15767242014408112 - f1-score (micro avg) 0.833
2023-10-16 19:57:44,694 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:46,340 epoch 10 - iter 27/272 - loss 0.00835601 - time (sec): 1.64 - samples/sec: 3466.33 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:47,783 epoch 10 - iter 54/272 - loss 0.00485101 - time (sec): 3.09 - samples/sec: 3283.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:49,267 epoch 10 - iter 81/272 - loss 0.00584821 - time (sec): 4.57 - samples/sec: 3248.55 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:50,925 epoch 10 - iter 108/272 - loss 0.00558808 - time (sec): 6.23 - samples/sec: 3275.67 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:52,542 epoch 10 - iter 135/272 - loss 0.00481861 - time (sec): 7.85 - samples/sec: 3322.11 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:54,193 epoch 10 - iter 162/272 - loss 0.00836834 - time (sec): 9.50 - samples/sec: 3313.17 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:55,900 epoch 10 - iter 189/272 - loss 0.00829871 - time (sec): 11.20 - samples/sec: 3295.75 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:57,412 epoch 10 - iter 216/272 - loss 0.00879782 - time (sec): 12.72 - samples/sec: 3265.60 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:58,914 epoch 10 - iter 243/272 - loss 0.00937661 - time (sec): 14.22 - samples/sec: 3261.54 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:58:00,543 epoch 10 - iter 270/272 - loss 0.00993645 - time (sec): 15.85 - samples/sec: 3265.36 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:58:00,639 ----------------------------------------------------------------------------------------------------
2023-10-16 19:58:00,640 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-16 19:58:02,063 DEV : loss 0.16342675685882568 - f1-score (micro avg) 0.819
2023-10-16 19:58:02,408 ----------------------------------------------------------------------------------------------------
2023-10-16 19:58:02,409 Loading model from best epoch ...
2023-10-16 19:58:04,004 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
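The 17-tag dictionary above is the BIOES expansion of the four NewsEye entity types plus the outside tag `O`. Assuming the BIOES scheme and the type order shown in the log line, it can be reconstructed as:

```python
# Entity types in the order they appear in the tag dictionary above.
ENTITY_TYPES = ["LOC", "PER", "HumanProd", "ORG"]

# "O" plus one S-/B-/E-/I- tag per entity type: 1 + 4 * 4 = 17 tags.
tags = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in "SBEI"]
print(len(tags))  # 17
```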
2023-10-16 19:58:05,970
Results:
- F-score (micro) 0.7955
- F-score (macro) 0.7533
- Accuracy 0.6786
By class:
              precision    recall  f1-score   support

         LOC     0.8212    0.8686    0.8442       312
         PER     0.7344    0.8510    0.7884       208
         ORG     0.5000    0.4000    0.4444        55
   HumanProd     0.8800    1.0000    0.9362        22

   micro avg     0.7688    0.8241    0.7955       597
   macro avg     0.7339    0.7799    0.7533       597
weighted avg     0.7636    0.8241    0.7913       597
2023-10-16 19:58:05,970 ----------------------------------------------------------------------------------------------------
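The aggregate rows of the final table follow directly from the per-class rows: macro average is the unweighted mean of class F1 scores, weighted average is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and recall. A small sketch with the values copied from the log (micro precision/recall taken from the `micro avg` row):

```python
# Per-class scores copied from the results table: (precision, recall, f1, support).
per_class = {
    "LOC":       (0.8212, 0.8686, 0.8442, 312),
    "PER":       (0.7344, 0.8510, 0.7884, 208),
    "ORG":       (0.5000, 0.4000, 0.4444,  55),
    "HumanProd": (0.8800, 1.0000, 0.9362,  22),
}

# Macro average: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
total_support = sum(s for *_, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1: harmonic mean of micro precision and recall.
micro_p, micro_r = 0.7688, 0.8241
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# -> 0.7533 0.7913 0.7955, matching the table
```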