2023-10-16 19:45:08,871 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 19:45:08,872 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:45:08,872 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 Train: 1085 sentences
2023-10-16 19:45:08,872 (train_with_dev=False, train_with_test=False)
2023-10-16 19:45:08,872 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 Training Params:
2023-10-16 19:45:08,872 - learning_rate: "5e-05"
2023-10-16 19:45:08,872 - mini_batch_size: "4"
2023-10-16 19:45:08,872 - max_epochs: "10"
2023-10-16 19:45:08,872 - shuffle: "True"
2023-10-16 19:45:08,872 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 Plugins:
2023-10-16 19:45:08,872 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 19:45:08,872 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:45:08,872 - metric: "('micro avg', 'f1-score')"
2023-10-16 19:45:08,872 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,872 Computation:
2023-10-16 19:45:08,872 - compute on device: cuda:0
2023-10-16 19:45:08,872 - embedding storage: none
2023-10-16 19:45:08,873 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,873 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 19:45:08,873 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:08,873 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:10,678 epoch 1 - iter 27/272 - loss 2.96086262 - time (sec): 1.80 - samples/sec: 3480.03 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:45:12,125 epoch 1 - iter 54/272 - loss 2.36130554 - time (sec): 3.25 - samples/sec: 3370.99 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:45:13,608 epoch 1 - iter 81/272 - loss 1.79808365 - time (sec): 4.73 - samples/sec: 3399.34 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:45:15,120 epoch 1 - iter 108/272 - loss 1.50600816 - time (sec): 6.25 - samples/sec: 3338.13 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:45:16,814 epoch 1 - iter 135/272 - loss 1.26819459 - time (sec): 7.94 - samples/sec: 3326.24 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:45:18,426 epoch 1 - iter 162/272 - loss 1.12275943 - time (sec): 9.55 - samples/sec: 3308.46 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:45:20,100 epoch 1 - iter 189/272 - loss 0.99283442 - time (sec): 11.23 - samples/sec: 3339.66 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:45:21,612 epoch 1 - iter 216/272 - loss 0.91857983 - time (sec): 12.74 - samples/sec: 3290.94 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:45:23,087 epoch 1 - iter 243/272 - loss 0.85492511 - time (sec): 14.21 - samples/sec: 3292.10 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:45:24,709 epoch 1 - iter 270/272 - loss 0.79776594 - time (sec): 15.84 - samples/sec: 3260.44 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:45:24,835 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:24,835 EPOCH 1 done: loss 0.7948 - lr: 0.000049
2023-10-16 19:45:25,760 DEV : loss 0.15580207109451294 - f1-score (micro avg) 0.663
2023-10-16 19:45:25,769 saving best model
2023-10-16 19:45:26,201 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:27,812 epoch 2 - iter 27/272 - loss 0.14714516 - time (sec): 1.61 - samples/sec: 3341.25 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:45:29,482 epoch 2 - iter 54/272 - loss 0.15012254 - time (sec): 3.28 - samples/sec: 3322.33 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:45:31,464 epoch 2 - iter 81/272 - loss 0.15178807 - time (sec): 5.26 - samples/sec: 3282.77 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:45:33,059 epoch 2 - iter 108/272 - loss 0.16371495 - time (sec): 6.86 - samples/sec: 3245.66 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:45:34,472 epoch 2 - iter 135/272 - loss 0.16423471 - time (sec): 8.27 - samples/sec: 3245.58 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:45:35,972 epoch 2 - iter 162/272 - loss 0.15787672 - time (sec): 9.77 - samples/sec: 3277.95 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:45:37,380 epoch 2 - iter 189/272 - loss 0.15891193 - time (sec): 11.18 - samples/sec: 3247.36 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:45:38,984 epoch 2 - iter 216/272 - loss 0.14944095 - time (sec): 12.78 - samples/sec: 3302.38 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:45:40,408 epoch 2 - iter 243/272 - loss 0.15144390 - time (sec): 14.21 - samples/sec: 3280.60 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:45:41,987 epoch 2 - iter 270/272 - loss 0.15335946 - time (sec): 15.78 - samples/sec: 3268.76 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:45:42,123 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:42,124 EPOCH 2 done: loss 0.1529 - lr: 0.000045
2023-10-16 19:45:43,604 DEV : loss 0.10786943882703781 - f1-score (micro avg) 0.7299
2023-10-16 19:45:43,610 saving best model
2023-10-16 19:45:44,174 ----------------------------------------------------------------------------------------------------
2023-10-16 19:45:45,775 epoch 3 - iter 27/272 - loss 0.11497974 - time (sec): 1.60 - samples/sec: 3353.49 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:45:47,367 epoch 3 - iter 54/272 - loss 0.11897320 - time (sec): 3.19 - samples/sec: 3381.17 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:45:48,752 epoch 3 - iter 81/272 - loss 0.10305969 - time (sec): 4.58 - samples/sec: 3299.87 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:45:50,470 epoch 3 - iter 108/272 - loss 0.09887392 - time (sec): 6.29 - samples/sec: 3243.54 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:45:52,108 epoch 3 - iter 135/272 - loss 0.10322268 - time (sec): 7.93 - samples/sec: 3222.86 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:45:53,627 epoch 3 - iter 162/272 - loss 0.10022417 - time (sec): 9.45 - samples/sec: 3239.27 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:45:55,059 epoch 3 - iter 189/272 - loss 0.10007514 - time (sec): 10.88 - samples/sec: 3210.32 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:45:56,642 epoch 3 - iter 216/272 - loss 0.09311613 - time (sec): 12.47 - samples/sec: 3270.13 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:45:58,193 epoch 3 - iter 243/272 - loss 0.08965973 - time (sec): 14.02 - samples/sec: 3281.12 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:45:59,910 epoch 3 - iter 270/272 - loss 0.08566273 - time (sec): 15.73 - samples/sec: 3286.40 - lr: 0.000039 - momentum: 0.000000
2023-10-16 19:46:00,002 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:00,002 EPOCH 3 done: loss 0.0852 - lr: 0.000039
2023-10-16 19:46:01,437 DEV : loss 0.12968279421329498 - f1-score (micro avg) 0.7607
2023-10-16 19:46:01,441 saving best model
2023-10-16 19:46:01,910 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:03,507 epoch 4 - iter 27/272 - loss 0.06325830 - time (sec): 1.59 - samples/sec: 3087.79 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:46:04,958 epoch 4 - iter 54/272 - loss 0.05768070 - time (sec): 3.04 - samples/sec: 3048.52 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:46:06,497 epoch 4 - iter 81/272 - loss 0.06187859 - time (sec): 4.58 - samples/sec: 3239.23 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:46:08,091 epoch 4 - iter 108/272 - loss 0.05467391 - time (sec): 6.17 - samples/sec: 3244.93 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:46:09,593 epoch 4 - iter 135/272 - loss 0.05895317 - time (sec): 7.68 - samples/sec: 3263.99 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:46:11,140 epoch 4 - iter 162/272 - loss 0.05428778 - time (sec): 9.22 - samples/sec: 3279.22 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:46:12,830 epoch 4 - iter 189/272 - loss 0.05499807 - time (sec): 10.91 - samples/sec: 3261.29 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:46:14,528 epoch 4 - iter 216/272 - loss 0.05405387 - time (sec): 12.61 - samples/sec: 3255.85 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:46:16,104 epoch 4 - iter 243/272 - loss 0.05087587 - time (sec): 14.19 - samples/sec: 3254.79 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:46:17,735 epoch 4 - iter 270/272 - loss 0.05246963 - time (sec): 15.82 - samples/sec: 3280.27 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:46:17,816 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:17,816 EPOCH 4 done: loss 0.0531 - lr: 0.000033
2023-10-16 19:46:19,246 DEV : loss 0.1382606476545334 - f1-score (micro avg) 0.7804
2023-10-16 19:46:19,251 saving best model
2023-10-16 19:46:19,747 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:21,212 epoch 5 - iter 27/272 - loss 0.02778358 - time (sec): 1.46 - samples/sec: 3105.59 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:46:22,688 epoch 5 - iter 54/272 - loss 0.02467950 - time (sec): 2.94 - samples/sec: 3262.26 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:46:24,296 epoch 5 - iter 81/272 - loss 0.03050316 - time (sec): 4.55 - samples/sec: 3337.02 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:46:25,861 epoch 5 - iter 108/272 - loss 0.02863410 - time (sec): 6.11 - samples/sec: 3368.48 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:46:27,445 epoch 5 - iter 135/272 - loss 0.02590019 - time (sec): 7.69 - samples/sec: 3328.21 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:46:28,998 epoch 5 - iter 162/272 - loss 0.02918530 - time (sec): 9.25 - samples/sec: 3350.04 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:46:30,642 epoch 5 - iter 189/272 - loss 0.03505049 - time (sec): 10.89 - samples/sec: 3332.87 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:46:32,147 epoch 5 - iter 216/272 - loss 0.03695742 - time (sec): 12.40 - samples/sec: 3348.97 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:46:33,693 epoch 5 - iter 243/272 - loss 0.03836264 - time (sec): 13.94 - samples/sec: 3324.84 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:46:35,304 epoch 5 - iter 270/272 - loss 0.04024680 - time (sec): 15.55 - samples/sec: 3318.24 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:46:35,410 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:35,411 EPOCH 5 done: loss 0.0404 - lr: 0.000028
2023-10-16 19:46:37,035 DEV : loss 0.13831038773059845 - f1-score (micro avg) 0.8281
2023-10-16 19:46:37,039 saving best model
2023-10-16 19:46:37,537 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:39,141 epoch 6 - iter 27/272 - loss 0.02201561 - time (sec): 1.60 - samples/sec: 3173.74 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:46:40,759 epoch 6 - iter 54/272 - loss 0.02366348 - time (sec): 3.22 - samples/sec: 3202.67 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:46:42,240 epoch 6 - iter 81/272 - loss 0.02393184 - time (sec): 4.70 - samples/sec: 3243.73 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:46:43,674 epoch 6 - iter 108/272 - loss 0.02461930 - time (sec): 6.13 - samples/sec: 3228.93 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:46:45,268 epoch 6 - iter 135/272 - loss 0.03024604 - time (sec): 7.73 - samples/sec: 3287.11 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:46:46,899 epoch 6 - iter 162/272 - loss 0.03209193 - time (sec): 9.36 - samples/sec: 3323.90 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:46:48,464 epoch 6 - iter 189/272 - loss 0.02936378 - time (sec): 10.92 - samples/sec: 3333.47 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:46:50,208 epoch 6 - iter 216/272 - loss 0.02718790 - time (sec): 12.67 - samples/sec: 3331.93 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:46:51,727 epoch 6 - iter 243/272 - loss 0.02497498 - time (sec): 14.19 - samples/sec: 3327.98 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:46:53,259 epoch 6 - iter 270/272 - loss 0.02583978 - time (sec): 15.72 - samples/sec: 3301.49 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:46:53,349 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:53,349 EPOCH 6 done: loss 0.0259 - lr: 0.000022
2023-10-16 19:46:54,778 DEV : loss 0.16786816716194153 - f1-score (micro avg) 0.829
2023-10-16 19:46:54,782 saving best model
2023-10-16 19:46:55,269 ----------------------------------------------------------------------------------------------------
2023-10-16 19:46:56,998 epoch 7 - iter 27/272 - loss 0.01192496 - time (sec): 1.72 - samples/sec: 3108.05 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:46:58,498 epoch 7 - iter 54/272 - loss 0.01313309 - time (sec): 3.22 - samples/sec: 3116.45 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:47:00,113 epoch 7 - iter 81/272 - loss 0.01828642 - time (sec): 4.84 - samples/sec: 3302.04 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:47:01,549 epoch 7 - iter 108/272 - loss 0.01997851 - time (sec): 6.27 - samples/sec: 3220.44 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:47:03,099 epoch 7 - iter 135/272 - loss 0.01792751 - time (sec): 7.82 - samples/sec: 3184.97 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:47:04,732 epoch 7 - iter 162/272 - loss 0.01681276 - time (sec): 9.46 - samples/sec: 3254.06 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:47:06,392 epoch 7 - iter 189/272 - loss 0.01581105 - time (sec): 11.12 - samples/sec: 3294.33 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:47:08,055 epoch 7 - iter 216/272 - loss 0.01718819 - time (sec): 12.78 - samples/sec: 3281.35 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:47:09,600 epoch 7 - iter 243/272 - loss 0.01668342 - time (sec): 14.33 - samples/sec: 3274.57 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:47:11,136 epoch 7 - iter 270/272 - loss 0.01703726 - time (sec): 15.86 - samples/sec: 3270.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:47:11,225 ----------------------------------------------------------------------------------------------------
2023-10-16 19:47:11,225 EPOCH 7 done: loss 0.0171 - lr: 0.000017
2023-10-16 19:47:12,660 DEV : loss 0.1661759316921234 - f1-score (micro avg) 0.816
2023-10-16 19:47:12,665 ----------------------------------------------------------------------------------------------------
2023-10-16 19:47:14,262 epoch 8 - iter 27/272 - loss 0.00878954 - time (sec): 1.60 - samples/sec: 3343.98 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:47:15,837 epoch 8 - iter 54/272 - loss 0.01349225 - time (sec): 3.17 - samples/sec: 3311.94 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:47:17,393 epoch 8 - iter 81/272 - loss 0.01362705 - time (sec): 4.73 - samples/sec: 3256.86 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:47:18,911 epoch 8 - iter 108/272 - loss 0.01720786 - time (sec): 6.25 - samples/sec: 3296.03 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:47:20,416 epoch 8 - iter 135/272 - loss 0.01657915 - time (sec): 7.75 - samples/sec: 3270.86 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:47:22,232 epoch 8 - iter 162/272 - loss 0.01707526 - time (sec): 9.57 - samples/sec: 3318.88 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:47:23,687 epoch 8 - iter 189/272 - loss 0.01518853 - time (sec): 11.02 - samples/sec: 3300.73 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:47:25,183 epoch 8 - iter 216/272 - loss 0.01442840 - time (sec): 12.52 - samples/sec: 3319.84 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:47:26,605 epoch 8 - iter 243/272 - loss 0.01387485 - time (sec): 13.94 - samples/sec: 3297.79 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:47:28,370 epoch 8 - iter 270/272 - loss 0.01535666 - time (sec): 15.70 - samples/sec: 3302.33 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:47:28,453 ----------------------------------------------------------------------------------------------------
2023-10-16 19:47:28,454 EPOCH 8 done: loss 0.0153 - lr: 0.000011
2023-10-16 19:47:29,907 DEV : loss 0.16990669071674347 - f1-score (micro avg) 0.8168
2023-10-16 19:47:29,912 ----------------------------------------------------------------------------------------------------
2023-10-16 19:47:31,702 epoch 9 - iter 27/272 - loss 0.00924822 - time (sec): 1.79 - samples/sec: 3811.79 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:47:33,246 epoch 9 - iter 54/272 - loss 0.00584536 - time (sec): 3.33 - samples/sec: 3549.08 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:47:34,939 epoch 9 - iter 81/272 - loss 0.00695411 - time (sec): 5.03 - samples/sec: 3275.93 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:47:36,477 epoch 9 - iter 108/272 - loss 0.00913707 - time (sec): 6.56 - samples/sec: 3308.04 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:47:38,135 epoch 9 - iter 135/272 - loss 0.00808347 - time (sec): 8.22 - samples/sec: 3302.30 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:47:39,677 epoch 9 - iter 162/272 - loss 0.00873095 - time (sec): 9.76 - samples/sec: 3269.77 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:47:41,210 epoch 9 - iter 189/272 - loss 0.00838996 - time (sec): 11.30 - samples/sec: 3294.32 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:47:42,757 epoch 9 - iter 216/272 - loss 0.00860744 - time (sec): 12.84 - samples/sec: 3270.87 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:47:44,332 epoch 9 - iter 243/272 - loss 0.00878071 - time (sec): 14.42 - samples/sec: 3269.45 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:47:45,829 epoch 9 - iter 270/272 - loss 0.00852981 - time (sec): 15.92 - samples/sec: 3256.72 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:47:45,921 ----------------------------------------------------------------------------------------------------
2023-10-16 19:47:45,921 EPOCH 9 done: loss 0.0085 - lr: 0.000006
2023-10-16 19:47:47,348 DEV : loss 0.1631445735692978 - f1-score (micro avg) 0.8278
2023-10-16 19:47:47,352 ----------------------------------------------------------------------------------------------------
2023-10-16 19:47:48,866 epoch 10 - iter 27/272 - loss 0.00818875 - time (sec): 1.51 - samples/sec: 3617.91 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:47:50,236 epoch 10 - iter 54/272 - loss 0.00537678 - time (sec): 2.88 - samples/sec: 3331.19 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:47:51,839 epoch 10 - iter 81/272 - loss 0.00434446 - time (sec): 4.49 - samples/sec: 3324.57 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:47:53,372 epoch 10 - iter 108/272 - loss 0.00584214 - time (sec): 6.02 - samples/sec: 3338.86 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:47:54,917 epoch 10 - iter 135/272 - loss 0.00657058 - time (sec): 7.56 - samples/sec: 3321.54 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:47:56,668 epoch 10 - iter 162/272 - loss 0.00641731 - time (sec): 9.31 - samples/sec: 3306.33 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:47:58,254 epoch 10 - iter 189/272 - loss 0.00705789 - time (sec): 10.90 - samples/sec: 3309.06 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:47:59,767 epoch 10 - iter 216/272 - loss 0.00738649 - time (sec): 12.41 - samples/sec: 3293.63 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:48:01,330 epoch 10 - iter 243/272 - loss 0.00671284 - time (sec): 13.98 - samples/sec: 3278.61 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:48:02,982 epoch 10 - iter 270/272 - loss 0.00632955 - time (sec): 15.63 - samples/sec: 3315.19 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:48:03,063 ----------------------------------------------------------------------------------------------------
2023-10-16 19:48:03,063 EPOCH 10 done: loss 0.0063 - lr: 0.000000
2023-10-16 19:48:04,504 DEV : loss 0.1695002317428589 - f1-score (micro avg) 0.8275
2023-10-16 19:48:04,845 ----------------------------------------------------------------------------------------------------
2023-10-16 19:48:04,846 Loading model from best epoch ...
2023-10-16 19:48:06,389 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 19:48:08,347
Results:
- F-score (micro) 0.7847
- F-score (macro) 0.746
- Accuracy 0.6601
By class:
              precision    recall  f1-score   support

         LOC     0.7865    0.8974    0.8383       312
         PER     0.6822    0.8462    0.7554       208
         ORG     0.5814    0.4545    0.5102        55
   HumanProd     0.7857    1.0000    0.8800        22

   micro avg     0.7343    0.8425    0.7847       597
   macro avg     0.7089    0.7995    0.7460       597
weighted avg     0.7312    0.8425    0.7807       597
2023-10-16 19:48:08,347 ----------------------------------------------------------------------------------------------------