2023-10-25 21:39:22,701 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,702 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:39:22,702 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Train: 1085 sentences
2023-10-25 21:39:22,703 (train_with_dev=False, train_with_test=False)
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Training Params:
2023-10-25 21:39:22,703 - learning_rate: "5e-05"
2023-10-25 21:39:22,703 - mini_batch_size: "4"
2023-10-25 21:39:22,703 - max_epochs: "10"
2023-10-25 21:39:22,703 - shuffle: "True"
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Plugins:
2023-10-25 21:39:22,703 - TensorboardLogger
2023-10-25 21:39:22,703 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:39:22,704 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 Computation:
2023-10-25 21:39:22,704 - compute on device: cuda:0
2023-10-25 21:39:22,704 - embedding storage: none
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 Logging anything other than scalars to TensorBoard is currently not supported.
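The learning rates printed in the epoch lines that follow come from the LinearScheduler plugin noted above (warmup_fraction 0.1 over 10 epochs × 272 batches = 2720 steps). A minimal sketch of that schedule, assuming linear warmup to the peak rate followed by linear decay to zero; the function name and signature are illustrative, not Flair's API:

```python
def linear_schedule_lr(step, total_steps=2720, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 272 warmup steps here
    if step < warmup_steps:
        # Ramp up during roughly the first epoch
        return peak_lr * step / warmup_steps
    # Decay linearly to zero over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, step 27 gives about 5e-06 and step 270 about 4.96e-05, matching the truncated lr column of epoch 1; by the last step of epoch 10 the rate has decayed to zero.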
2023-10-25 21:39:24,155 epoch 1 - iter 27/272 - loss 2.49805446 - time (sec): 1.45 - samples/sec: 3540.00 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:39:25,675 epoch 1 - iter 54/272 - loss 1.67544873 - time (sec): 2.97 - samples/sec: 3496.77 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:39:27,155 epoch 1 - iter 81/272 - loss 1.26708392 - time (sec): 4.45 - samples/sec: 3630.39 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:39:28,663 epoch 1 - iter 108/272 - loss 1.04932330 - time (sec): 5.96 - samples/sec: 3687.73 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:39:30,181 epoch 1 - iter 135/272 - loss 0.90812403 - time (sec): 7.48 - samples/sec: 3600.75 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:39:31,693 epoch 1 - iter 162/272 - loss 0.80144087 - time (sec): 8.99 - samples/sec: 3564.49 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:39:33,167 epoch 1 - iter 189/272 - loss 0.71754388 - time (sec): 10.46 - samples/sec: 3591.83 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:39:34,661 epoch 1 - iter 216/272 - loss 0.66141360 - time (sec): 11.96 - samples/sec: 3525.30 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:39:36,195 epoch 1 - iter 243/272 - loss 0.60466834 - time (sec): 13.49 - samples/sec: 3541.07 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:39:37,648 epoch 1 - iter 270/272 - loss 0.57299630 - time (sec): 14.94 - samples/sec: 3462.38 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:39:37,745 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:37,745 EPOCH 1 done: loss 0.5710 - lr: 0.000049
2023-10-25 21:39:38,421 DEV : loss 0.1312190741300583 - f1-score (micro avg) 0.6734
2023-10-25 21:39:38,428 saving best model
2023-10-25 21:39:38,939 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:40,412 epoch 2 - iter 27/272 - loss 0.13994969 - time (sec): 1.47 - samples/sec: 3542.86 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:39:41,895 epoch 2 - iter 54/272 - loss 0.14777439 - time (sec): 2.95 - samples/sec: 3635.29 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:39:43,400 epoch 2 - iter 81/272 - loss 0.14433581 - time (sec): 4.46 - samples/sec: 3437.04 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:39:44,891 epoch 2 - iter 108/272 - loss 0.13742883 - time (sec): 5.95 - samples/sec: 3398.38 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:39:46,405 epoch 2 - iter 135/272 - loss 0.12798837 - time (sec): 7.46 - samples/sec: 3425.00 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:39:47,911 epoch 2 - iter 162/272 - loss 0.13103800 - time (sec): 8.97 - samples/sec: 3424.63 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:39:49,449 epoch 2 - iter 189/272 - loss 0.13367852 - time (sec): 10.51 - samples/sec: 3472.65 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:39:50,888 epoch 2 - iter 216/272 - loss 0.13388695 - time (sec): 11.95 - samples/sec: 3419.50 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:39:52,414 epoch 2 - iter 243/272 - loss 0.13310624 - time (sec): 13.47 - samples/sec: 3458.75 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:39:53,891 epoch 2 - iter 270/272 - loss 0.12902925 - time (sec): 14.95 - samples/sec: 3448.09 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:39:54,014 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:54,015 EPOCH 2 done: loss 0.1288 - lr: 0.000045
2023-10-25 21:39:55,628 DEV : loss 0.11965033411979675 - f1-score (micro avg) 0.7882
2023-10-25 21:39:55,635 saving best model
2023-10-25 21:39:56,316 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:57,805 epoch 3 - iter 27/272 - loss 0.07229947 - time (sec): 1.49 - samples/sec: 2928.53 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:39:59,248 epoch 3 - iter 54/272 - loss 0.07657749 - time (sec): 2.93 - samples/sec: 3221.53 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:40:00,792 epoch 3 - iter 81/272 - loss 0.06566703 - time (sec): 4.47 - samples/sec: 3350.84 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:40:02,256 epoch 3 - iter 108/272 - loss 0.07494847 - time (sec): 5.94 - samples/sec: 3275.73 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:40:03,887 epoch 3 - iter 135/272 - loss 0.06987444 - time (sec): 7.57 - samples/sec: 3372.36 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:40:05,534 epoch 3 - iter 162/272 - loss 0.06586629 - time (sec): 9.22 - samples/sec: 3397.13 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:40:07,052 epoch 3 - iter 189/272 - loss 0.06688391 - time (sec): 10.73 - samples/sec: 3437.33 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:40:08,532 epoch 3 - iter 216/272 - loss 0.06969988 - time (sec): 12.21 - samples/sec: 3392.47 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:40:10,001 epoch 3 - iter 243/272 - loss 0.06930302 - time (sec): 13.68 - samples/sec: 3371.69 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:40:11,506 epoch 3 - iter 270/272 - loss 0.06894899 - time (sec): 15.19 - samples/sec: 3395.60 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:40:11,607 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:11,607 EPOCH 3 done: loss 0.0699 - lr: 0.000039
2023-10-25 21:40:12,807 DEV : loss 0.11476627737283707 - f1-score (micro avg) 0.7796
2023-10-25 21:40:12,813 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:14,352 epoch 4 - iter 27/272 - loss 0.03266996 - time (sec): 1.54 - samples/sec: 3713.00 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:40:15,870 epoch 4 - iter 54/272 - loss 0.03739765 - time (sec): 3.06 - samples/sec: 3748.93 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:40:17,437 epoch 4 - iter 81/272 - loss 0.03454158 - time (sec): 4.62 - samples/sec: 3703.81 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:40:18,898 epoch 4 - iter 108/272 - loss 0.03836219 - time (sec): 6.08 - samples/sec: 3585.70 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:40:20,332 epoch 4 - iter 135/272 - loss 0.03954303 - time (sec): 7.52 - samples/sec: 3442.41 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:40:21,810 epoch 4 - iter 162/272 - loss 0.04116692 - time (sec): 9.00 - samples/sec: 3434.51 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:40:23,406 epoch 4 - iter 189/272 - loss 0.04364166 - time (sec): 10.59 - samples/sec: 3520.44 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:40:24,868 epoch 4 - iter 216/272 - loss 0.04463442 - time (sec): 12.05 - samples/sec: 3491.56 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:40:26,286 epoch 4 - iter 243/272 - loss 0.04378423 - time (sec): 13.47 - samples/sec: 3458.50 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:40:27,719 epoch 4 - iter 270/272 - loss 0.04515536 - time (sec): 14.90 - samples/sec: 3457.30 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:40:27,816 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:27,817 EPOCH 4 done: loss 0.0448 - lr: 0.000033
2023-10-25 21:40:28,998 DEV : loss 0.16075612604618073 - f1-score (micro avg) 0.7817
2023-10-25 21:40:29,003 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:30,401 epoch 5 - iter 27/272 - loss 0.02593003 - time (sec): 1.40 - samples/sec: 3711.81 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:40:31,853 epoch 5 - iter 54/272 - loss 0.02124892 - time (sec): 2.85 - samples/sec: 3440.04 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:40:33,271 epoch 5 - iter 81/272 - loss 0.02821459 - time (sec): 4.27 - samples/sec: 3361.61 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:40:34,717 epoch 5 - iter 108/272 - loss 0.02765530 - time (sec): 5.71 - samples/sec: 3429.78 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:40:36,143 epoch 5 - iter 135/272 - loss 0.03611608 - time (sec): 7.14 - samples/sec: 3424.71 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:40:37,651 epoch 5 - iter 162/272 - loss 0.03369221 - time (sec): 8.65 - samples/sec: 3535.61 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:40:39,147 epoch 5 - iter 189/272 - loss 0.03139247 - time (sec): 10.14 - samples/sec: 3589.30 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:40:40,625 epoch 5 - iter 216/272 - loss 0.03358985 - time (sec): 11.62 - samples/sec: 3586.62 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:40:42,089 epoch 5 - iter 243/272 - loss 0.03662728 - time (sec): 13.08 - samples/sec: 3532.03 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:40:43,560 epoch 5 - iter 270/272 - loss 0.03537001 - time (sec): 14.56 - samples/sec: 3560.81 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:40:43,665 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:43,666 EPOCH 5 done: loss 0.0353 - lr: 0.000028
2023-10-25 21:40:44,800 DEV : loss 0.16380402445793152 - f1-score (micro avg) 0.7751
2023-10-25 21:40:44,805 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:46,342 epoch 6 - iter 27/272 - loss 0.01481133 - time (sec): 1.54 - samples/sec: 3732.48 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:40:47,863 epoch 6 - iter 54/272 - loss 0.01850367 - time (sec): 3.06 - samples/sec: 3695.05 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:40:49,311 epoch 6 - iter 81/272 - loss 0.01883298 - time (sec): 4.50 - samples/sec: 3565.15 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:40:50,830 epoch 6 - iter 108/272 - loss 0.01882319 - time (sec): 6.02 - samples/sec: 3476.29 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:40:52,286 epoch 6 - iter 135/272 - loss 0.02367874 - time (sec): 7.48 - samples/sec: 3414.32 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:40:53,800 epoch 6 - iter 162/272 - loss 0.02132550 - time (sec): 8.99 - samples/sec: 3441.34 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:40:55,301 epoch 6 - iter 189/272 - loss 0.02360473 - time (sec): 10.49 - samples/sec: 3515.45 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:40:57,222 epoch 6 - iter 216/272 - loss 0.02161557 - time (sec): 12.42 - samples/sec: 3396.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:40:58,712 epoch 6 - iter 243/272 - loss 0.02278553 - time (sec): 13.90 - samples/sec: 3387.42 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:41:00,134 epoch 6 - iter 270/272 - loss 0.02113688 - time (sec): 15.33 - samples/sec: 3374.35 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:41:00,245 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:00,245 EPOCH 6 done: loss 0.0210 - lr: 0.000022
2023-10-25 21:41:01,550 DEV : loss 0.1769583821296692 - f1-score (micro avg) 0.7993
2023-10-25 21:41:01,557 saving best model
2023-10-25 21:41:02,266 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:03,785 epoch 7 - iter 27/272 - loss 0.02688400 - time (sec): 1.52 - samples/sec: 2915.87 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:41:05,247 epoch 7 - iter 54/272 - loss 0.02234700 - time (sec): 2.98 - samples/sec: 3064.03 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:41:06,737 epoch 7 - iter 81/272 - loss 0.01669336 - time (sec): 4.47 - samples/sec: 3160.69 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:41:08,276 epoch 7 - iter 108/272 - loss 0.01645342 - time (sec): 6.01 - samples/sec: 3273.87 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:41:09,728 epoch 7 - iter 135/272 - loss 0.01547215 - time (sec): 7.46 - samples/sec: 3279.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:41:11,200 epoch 7 - iter 162/272 - loss 0.01717666 - time (sec): 8.93 - samples/sec: 3344.47 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:41:12,708 epoch 7 - iter 189/272 - loss 0.01813691 - time (sec): 10.44 - samples/sec: 3405.98 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:41:14,193 epoch 7 - iter 216/272 - loss 0.01828301 - time (sec): 11.92 - samples/sec: 3383.23 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:41:15,699 epoch 7 - iter 243/272 - loss 0.01781809 - time (sec): 13.43 - samples/sec: 3424.57 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:41:17,291 epoch 7 - iter 270/272 - loss 0.01696126 - time (sec): 15.02 - samples/sec: 3431.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:41:17,416 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:17,417 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-25 21:41:18,575 DEV : loss 0.19760426878929138 - f1-score (micro avg) 0.7934
2023-10-25 21:41:18,582 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:20,040 epoch 8 - iter 27/272 - loss 0.00228138 - time (sec): 1.46 - samples/sec: 3302.28 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:41:21,592 epoch 8 - iter 54/272 - loss 0.00949457 - time (sec): 3.01 - samples/sec: 3441.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:41:23,128 epoch 8 - iter 81/272 - loss 0.01336783 - time (sec): 4.54 - samples/sec: 3542.85 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:41:24,674 epoch 8 - iter 108/272 - loss 0.01437311 - time (sec): 6.09 - samples/sec: 3499.09 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:41:26,109 epoch 8 - iter 135/272 - loss 0.01210584 - time (sec): 7.53 - samples/sec: 3418.67 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:41:27,576 epoch 8 - iter 162/272 - loss 0.01394525 - time (sec): 8.99 - samples/sec: 3509.76 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:41:28,942 epoch 8 - iter 189/272 - loss 0.01282887 - time (sec): 10.36 - samples/sec: 3529.47 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:41:30,391 epoch 8 - iter 216/272 - loss 0.01369915 - time (sec): 11.81 - samples/sec: 3558.74 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:41:31,813 epoch 8 - iter 243/272 - loss 0.01272860 - time (sec): 13.23 - samples/sec: 3522.68 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:41:33,284 epoch 8 - iter 270/272 - loss 0.01228205 - time (sec): 14.70 - samples/sec: 3523.27 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:41:33,379 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:33,379 EPOCH 8 done: loss 0.0123 - lr: 0.000011
2023-10-25 21:41:34,601 DEV : loss 0.18747395277023315 - f1-score (micro avg) 0.8051
2023-10-25 21:41:34,608 saving best model
2023-10-25 21:41:35,363 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:36,857 epoch 9 - iter 27/272 - loss 0.00231539 - time (sec): 1.49 - samples/sec: 3274.11 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:41:38,266 epoch 9 - iter 54/272 - loss 0.00500163 - time (sec): 2.90 - samples/sec: 3069.42 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:41:39,718 epoch 9 - iter 81/272 - loss 0.00555753 - time (sec): 4.35 - samples/sec: 3291.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:41:41,176 epoch 9 - iter 108/272 - loss 0.00495039 - time (sec): 5.81 - samples/sec: 3300.73 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:41:42,740 epoch 9 - iter 135/272 - loss 0.00560261 - time (sec): 7.38 - samples/sec: 3353.92 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:41:44,268 epoch 9 - iter 162/272 - loss 0.00590316 - time (sec): 8.90 - samples/sec: 3494.87 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:41:45,803 epoch 9 - iter 189/272 - loss 0.00586282 - time (sec): 10.44 - samples/sec: 3518.11 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:41:47,368 epoch 9 - iter 216/272 - loss 0.00624663 - time (sec): 12.00 - samples/sec: 3538.85 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:41:48,868 epoch 9 - iter 243/272 - loss 0.00782273 - time (sec): 13.50 - samples/sec: 3556.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:41:50,201 epoch 9 - iter 270/272 - loss 0.00780728 - time (sec): 14.84 - samples/sec: 3480.92 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:41:50,302 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:50,302 EPOCH 9 done: loss 0.0078 - lr: 0.000006
2023-10-25 21:41:51,457 DEV : loss 0.1823568344116211 - f1-score (micro avg) 0.8073
2023-10-25 21:41:51,463 saving best model
2023-10-25 21:41:52,196 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:53,706 epoch 10 - iter 27/272 - loss 0.00845377 - time (sec): 1.51 - samples/sec: 3165.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:41:55,521 epoch 10 - iter 54/272 - loss 0.00663455 - time (sec): 3.32 - samples/sec: 3013.55 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:41:56,962 epoch 10 - iter 81/272 - loss 0.00509540 - time (sec): 4.76 - samples/sec: 3295.57 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:41:58,360 epoch 10 - iter 108/272 - loss 0.00549684 - time (sec): 6.16 - samples/sec: 3232.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:41:59,792 epoch 10 - iter 135/272 - loss 0.00473920 - time (sec): 7.59 - samples/sec: 3308.63 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:42:01,228 epoch 10 - iter 162/272 - loss 0.00471395 - time (sec): 9.03 - samples/sec: 3339.99 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:42:02,675 epoch 10 - iter 189/272 - loss 0.00409399 - time (sec): 10.48 - samples/sec: 3387.44 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:42:04,112 epoch 10 - iter 216/272 - loss 0.00421195 - time (sec): 11.91 - samples/sec: 3445.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:42:05,525 epoch 10 - iter 243/272 - loss 0.00424710 - time (sec): 13.33 - samples/sec: 3470.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:42:07,001 epoch 10 - iter 270/272 - loss 0.00508973 - time (sec): 14.80 - samples/sec: 3507.30 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:42:07,089 ----------------------------------------------------------------------------------------------------
2023-10-25 21:42:07,089 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-25 21:42:08,275 DEV : loss 0.18975241482257843 - f1-score (micro avg) 0.8022
2023-10-25 21:42:08,776 ----------------------------------------------------------------------------------------------------
2023-10-25 21:42:08,777 Loading model from best epoch ...
2023-10-25 21:42:10,690 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 21:42:12,589
Results:
- F-score (micro) 0.7927
- F-score (macro) 0.7586
- Accuracy 0.6734
By class:
              precision    recall  f1-score   support

         LOC     0.8152    0.8910    0.8515       312
         PER     0.6892    0.8317    0.7538       208
         ORG     0.5957    0.5091    0.5490        55
   HumanProd     0.7857    1.0000    0.8800        22

   micro avg     0.7511    0.8392    0.7927       597
   macro avg     0.7215    0.8080    0.7586       597
weighted avg     0.7500    0.8392    0.7906       597
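The aggregate rows are consistent with the per-class numbers: the micro-average F1 is the harmonic mean of the micro precision and recall, while the macro F1 is the plain mean of the four class F1 scores. A quick check, with the values copied from the table above:

```python
def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

micro_f1 = f1(0.7511, 0.8392)                        # micro avg row -> ~0.7927
macro_f1 = (0.8515 + 0.7538 + 0.5490 + 0.8800) / 4   # mean of class F1s -> ~0.7586
```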
2023-10-25 21:42:12,589 ----------------------------------------------------------------------------------------------------
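For downstream analysis (e.g. plotting loss or lr curves), the per-iteration lines in this log follow a fixed format and can be parsed with a short stdlib sketch; the regex and field names below are assumptions inferred from this log, not part of Flair:

```python
import re

# Hypothetical pattern for lines like:
# "... epoch 1 - iter 27/272 - loss 2.49805446 - ... - lr: 0.000005 - ..."
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+).* - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line):
    """Return the numeric fields of one iteration line, or None if it doesn't match."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {"epoch": int(d["epoch"]), "iter": int(d["iter"]),
            "total": int(d["total"]), "loss": float(d["loss"]), "lr": float(d["lr"])}

line = ("2023-10-25 21:39:24,155 epoch 1 - iter 27/272 - loss 2.49805446"
        " - time (sec): 1.45 - samples/sec: 3540.00 - lr: 0.000005 - momentum: 0.000000")
```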