2023-10-25 21:39:22,701 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,702 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
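The printed module shapes are enough to estimate the model size. A rough count from the repr above (the `LockedDropout`, `Dropout`, and activation modules contribute no parameters); this is a back-of-the-envelope sketch, not anything Flair reports:

```python
# Dimensions read off the model repr above.
H, V, P, L, I, TAGS = 768, 64001, 512, 12, 3072, 17

emb = V * H + P * H + 2 * H + 2 * H    # word/position/token-type embeddings + LayerNorm
per_layer = (4 * (H * H + H)           # query/key/value + attention-output Linear
             + 2 * H                   # attention LayerNorm
             + (H * I + I)             # intermediate Linear
             + (I * H + H)             # output Linear
             + 2 * H)                  # output LayerNorm
pooler = H * H + H
head = H * TAGS + TAGS                 # the (linear) tagging head

total = emb + L * per_layer + pooler + head
print(f"{total:,}")                    # roughly 135M parameters
```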
2023-10-25 21:39:22,702 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Train: 1085 sentences
2023-10-25 21:39:22,703 (train_with_dev=False, train_with_test=False)
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Training Params:
2023-10-25 21:39:22,703 - learning_rate: "5e-05"
2023-10-25 21:39:22,703 - mini_batch_size: "4"
2023-10-25 21:39:22,703 - max_epochs: "10"
2023-10-25 21:39:22,703 - shuffle: "True"
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,703 Plugins:
2023-10-25 21:39:22,703 - TensorboardLogger
2023-10-25 21:39:22,703 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:39:22,703 ----------------------------------------------------------------------------------------------------
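The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the per-iteration `lr:` values below: the learning rate ramps up linearly over the first 10% of steps, then decays linearly to zero. A minimal sketch of such a schedule (not Flair's exact implementation), assuming the 272 batches/epoch × 10 epochs = 2720 steps seen in this log:

```python
def linear_schedule_lr(step, base_lr=5e-05, total_steps=2720, warmup_fraction=0.1):
    """Linear warmup to base_lr over warmup_fraction of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 272 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the values logged below:
# step 27 (epoch 1, iter 27)  -> ~0.000005
# step 544 (end of epoch 2)   -> ~0.000045
# step 2720 (end of training) ->  0.000000
```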
2023-10-25 21:39:22,703 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:39:22,704 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 Computation:
2023-10-25 21:39:22,704 - compute on device: cuda:0
2023-10-25 21:39:22,704 - embedding storage: none
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:22,704 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:39:24,155 epoch 1 - iter 27/272 - loss 2.49805446 - time (sec): 1.45 - samples/sec: 3540.00 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:39:25,675 epoch 1 - iter 54/272 - loss 1.67544873 - time (sec): 2.97 - samples/sec: 3496.77 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:39:27,155 epoch 1 - iter 81/272 - loss 1.26708392 - time (sec): 4.45 - samples/sec: 3630.39 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:39:28,663 epoch 1 - iter 108/272 - loss 1.04932330 - time (sec): 5.96 - samples/sec: 3687.73 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:39:30,181 epoch 1 - iter 135/272 - loss 0.90812403 - time (sec): 7.48 - samples/sec: 3600.75 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:39:31,693 epoch 1 - iter 162/272 - loss 0.80144087 - time (sec): 8.99 - samples/sec: 3564.49 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:39:33,167 epoch 1 - iter 189/272 - loss 0.71754388 - time (sec): 10.46 - samples/sec: 3591.83 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:39:34,661 epoch 1 - iter 216/272 - loss 0.66141360 - time (sec): 11.96 - samples/sec: 3525.30 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:39:36,195 epoch 1 - iter 243/272 - loss 0.60466834 - time (sec): 13.49 - samples/sec: 3541.07 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:39:37,648 epoch 1 - iter 270/272 - loss 0.57299630 - time (sec): 14.94 - samples/sec: 3462.38 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:39:37,745 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:37,745 EPOCH 1 done: loss 0.5710 - lr: 0.000049
2023-10-25 21:39:38,421 DEV : loss 0.1312190741300583 - f1-score (micro avg) 0.6734
2023-10-25 21:39:38,428 saving best model
2023-10-25 21:39:38,939 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:40,412 epoch 2 - iter 27/272 - loss 0.13994969 - time (sec): 1.47 - samples/sec: 3542.86 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:39:41,895 epoch 2 - iter 54/272 - loss 0.14777439 - time (sec): 2.95 - samples/sec: 3635.29 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:39:43,400 epoch 2 - iter 81/272 - loss 0.14433581 - time (sec): 4.46 - samples/sec: 3437.04 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:39:44,891 epoch 2 - iter 108/272 - loss 0.13742883 - time (sec): 5.95 - samples/sec: 3398.38 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:39:46,405 epoch 2 - iter 135/272 - loss 0.12798837 - time (sec): 7.46 - samples/sec: 3425.00 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:39:47,911 epoch 2 - iter 162/272 - loss 0.13103800 - time (sec): 8.97 - samples/sec: 3424.63 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:39:49,449 epoch 2 - iter 189/272 - loss 0.13367852 - time (sec): 10.51 - samples/sec: 3472.65 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:39:50,888 epoch 2 - iter 216/272 - loss 0.13388695 - time (sec): 11.95 - samples/sec: 3419.50 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:39:52,414 epoch 2 - iter 243/272 - loss 0.13310624 - time (sec): 13.47 - samples/sec: 3458.75 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:39:53,891 epoch 2 - iter 270/272 - loss 0.12902925 - time (sec): 14.95 - samples/sec: 3448.09 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:39:54,014 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:54,015 EPOCH 2 done: loss 0.1288 - lr: 0.000045
2023-10-25 21:39:55,628 DEV : loss 0.11965033411979675 - f1-score (micro avg) 0.7882
2023-10-25 21:39:55,635 saving best model
2023-10-25 21:39:56,316 ----------------------------------------------------------------------------------------------------
2023-10-25 21:39:57,805 epoch 3 - iter 27/272 - loss 0.07229947 - time (sec): 1.49 - samples/sec: 2928.53 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:39:59,248 epoch 3 - iter 54/272 - loss 0.07657749 - time (sec): 2.93 - samples/sec: 3221.53 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:40:00,792 epoch 3 - iter 81/272 - loss 0.06566703 - time (sec): 4.47 - samples/sec: 3350.84 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:40:02,256 epoch 3 - iter 108/272 - loss 0.07494847 - time (sec): 5.94 - samples/sec: 3275.73 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:40:03,887 epoch 3 - iter 135/272 - loss 0.06987444 - time (sec): 7.57 - samples/sec: 3372.36 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:40:05,534 epoch 3 - iter 162/272 - loss 0.06586629 - time (sec): 9.22 - samples/sec: 3397.13 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:40:07,052 epoch 3 - iter 189/272 - loss 0.06688391 - time (sec): 10.73 - samples/sec: 3437.33 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:40:08,532 epoch 3 - iter 216/272 - loss 0.06969988 - time (sec): 12.21 - samples/sec: 3392.47 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:40:10,001 epoch 3 - iter 243/272 - loss 0.06930302 - time (sec): 13.68 - samples/sec: 3371.69 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:40:11,506 epoch 3 - iter 270/272 - loss 0.06894899 - time (sec): 15.19 - samples/sec: 3395.60 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:40:11,607 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:11,607 EPOCH 3 done: loss 0.0699 - lr: 0.000039
2023-10-25 21:40:12,807 DEV : loss 0.11476627737283707 - f1-score (micro avg) 0.7796
2023-10-25 21:40:12,813 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:14,352 epoch 4 - iter 27/272 - loss 0.03266996 - time (sec): 1.54 - samples/sec: 3713.00 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:40:15,870 epoch 4 - iter 54/272 - loss 0.03739765 - time (sec): 3.06 - samples/sec: 3748.93 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:40:17,437 epoch 4 - iter 81/272 - loss 0.03454158 - time (sec): 4.62 - samples/sec: 3703.81 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:40:18,898 epoch 4 - iter 108/272 - loss 0.03836219 - time (sec): 6.08 - samples/sec: 3585.70 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:40:20,332 epoch 4 - iter 135/272 - loss 0.03954303 - time (sec): 7.52 - samples/sec: 3442.41 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:40:21,810 epoch 4 - iter 162/272 - loss 0.04116692 - time (sec): 9.00 - samples/sec: 3434.51 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:40:23,406 epoch 4 - iter 189/272 - loss 0.04364166 - time (sec): 10.59 - samples/sec: 3520.44 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:40:24,868 epoch 4 - iter 216/272 - loss 0.04463442 - time (sec): 12.05 - samples/sec: 3491.56 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:40:26,286 epoch 4 - iter 243/272 - loss 0.04378423 - time (sec): 13.47 - samples/sec: 3458.50 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:40:27,719 epoch 4 - iter 270/272 - loss 0.04515536 - time (sec): 14.90 - samples/sec: 3457.30 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:40:27,816 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:27,817 EPOCH 4 done: loss 0.0448 - lr: 0.000033
2023-10-25 21:40:28,998 DEV : loss 0.16075612604618073 - f1-score (micro avg) 0.7817
2023-10-25 21:40:29,003 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:30,401 epoch 5 - iter 27/272 - loss 0.02593003 - time (sec): 1.40 - samples/sec: 3711.81 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:40:31,853 epoch 5 - iter 54/272 - loss 0.02124892 - time (sec): 2.85 - samples/sec: 3440.04 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:40:33,271 epoch 5 - iter 81/272 - loss 0.02821459 - time (sec): 4.27 - samples/sec: 3361.61 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:40:34,717 epoch 5 - iter 108/272 - loss 0.02765530 - time (sec): 5.71 - samples/sec: 3429.78 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:40:36,143 epoch 5 - iter 135/272 - loss 0.03611608 - time (sec): 7.14 - samples/sec: 3424.71 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:40:37,651 epoch 5 - iter 162/272 - loss 0.03369221 - time (sec): 8.65 - samples/sec: 3535.61 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:40:39,147 epoch 5 - iter 189/272 - loss 0.03139247 - time (sec): 10.14 - samples/sec: 3589.30 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:40:40,625 epoch 5 - iter 216/272 - loss 0.03358985 - time (sec): 11.62 - samples/sec: 3586.62 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:40:42,089 epoch 5 - iter 243/272 - loss 0.03662728 - time (sec): 13.08 - samples/sec: 3532.03 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:40:43,560 epoch 5 - iter 270/272 - loss 0.03537001 - time (sec): 14.56 - samples/sec: 3560.81 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:40:43,665 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:43,666 EPOCH 5 done: loss 0.0353 - lr: 0.000028
2023-10-25 21:40:44,800 DEV : loss 0.16380402445793152 - f1-score (micro avg) 0.7751
2023-10-25 21:40:44,805 ----------------------------------------------------------------------------------------------------
2023-10-25 21:40:46,342 epoch 6 - iter 27/272 - loss 0.01481133 - time (sec): 1.54 - samples/sec: 3732.48 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:40:47,863 epoch 6 - iter 54/272 - loss 0.01850367 - time (sec): 3.06 - samples/sec: 3695.05 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:40:49,311 epoch 6 - iter 81/272 - loss 0.01883298 - time (sec): 4.50 - samples/sec: 3565.15 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:40:50,830 epoch 6 - iter 108/272 - loss 0.01882319 - time (sec): 6.02 - samples/sec: 3476.29 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:40:52,286 epoch 6 - iter 135/272 - loss 0.02367874 - time (sec): 7.48 - samples/sec: 3414.32 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:40:53,800 epoch 6 - iter 162/272 - loss 0.02132550 - time (sec): 8.99 - samples/sec: 3441.34 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:40:55,301 epoch 6 - iter 189/272 - loss 0.02360473 - time (sec): 10.49 - samples/sec: 3515.45 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:40:57,222 epoch 6 - iter 216/272 - loss 0.02161557 - time (sec): 12.42 - samples/sec: 3396.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:40:58,712 epoch 6 - iter 243/272 - loss 0.02278553 - time (sec): 13.90 - samples/sec: 3387.42 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:41:00,134 epoch 6 - iter 270/272 - loss 0.02113688 - time (sec): 15.33 - samples/sec: 3374.35 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:41:00,245 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:00,245 EPOCH 6 done: loss 0.0210 - lr: 0.000022
2023-10-25 21:41:01,550 DEV : loss 0.1769583821296692 - f1-score (micro avg) 0.7993
2023-10-25 21:41:01,557 saving best model
2023-10-25 21:41:02,266 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:03,785 epoch 7 - iter 27/272 - loss 0.02688400 - time (sec): 1.52 - samples/sec: 2915.87 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:41:05,247 epoch 7 - iter 54/272 - loss 0.02234700 - time (sec): 2.98 - samples/sec: 3064.03 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:41:06,737 epoch 7 - iter 81/272 - loss 0.01669336 - time (sec): 4.47 - samples/sec: 3160.69 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:41:08,276 epoch 7 - iter 108/272 - loss 0.01645342 - time (sec): 6.01 - samples/sec: 3273.87 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:41:09,728 epoch 7 - iter 135/272 - loss 0.01547215 - time (sec): 7.46 - samples/sec: 3279.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:41:11,200 epoch 7 - iter 162/272 - loss 0.01717666 - time (sec): 8.93 - samples/sec: 3344.47 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:41:12,708 epoch 7 - iter 189/272 - loss 0.01813691 - time (sec): 10.44 - samples/sec: 3405.98 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:41:14,193 epoch 7 - iter 216/272 - loss 0.01828301 - time (sec): 11.92 - samples/sec: 3383.23 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:41:15,699 epoch 7 - iter 243/272 - loss 0.01781809 - time (sec): 13.43 - samples/sec: 3424.57 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:41:17,291 epoch 7 - iter 270/272 - loss 0.01696126 - time (sec): 15.02 - samples/sec: 3431.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:41:17,416 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:17,417 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-25 21:41:18,575 DEV : loss 0.19760426878929138 - f1-score (micro avg) 0.7934
2023-10-25 21:41:18,582 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:20,040 epoch 8 - iter 27/272 - loss 0.00228138 - time (sec): 1.46 - samples/sec: 3302.28 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:41:21,592 epoch 8 - iter 54/272 - loss 0.00949457 - time (sec): 3.01 - samples/sec: 3441.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:41:23,128 epoch 8 - iter 81/272 - loss 0.01336783 - time (sec): 4.54 - samples/sec: 3542.85 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:41:24,674 epoch 8 - iter 108/272 - loss 0.01437311 - time (sec): 6.09 - samples/sec: 3499.09 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:41:26,109 epoch 8 - iter 135/272 - loss 0.01210584 - time (sec): 7.53 - samples/sec: 3418.67 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:41:27,576 epoch 8 - iter 162/272 - loss 0.01394525 - time (sec): 8.99 - samples/sec: 3509.76 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:41:28,942 epoch 8 - iter 189/272 - loss 0.01282887 - time (sec): 10.36 - samples/sec: 3529.47 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:41:30,391 epoch 8 - iter 216/272 - loss 0.01369915 - time (sec): 11.81 - samples/sec: 3558.74 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:41:31,813 epoch 8 - iter 243/272 - loss 0.01272860 - time (sec): 13.23 - samples/sec: 3522.68 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:41:33,284 epoch 8 - iter 270/272 - loss 0.01228205 - time (sec): 14.70 - samples/sec: 3523.27 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:41:33,379 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:33,379 EPOCH 8 done: loss 0.0123 - lr: 0.000011
2023-10-25 21:41:34,601 DEV : loss 0.18747395277023315 - f1-score (micro avg) 0.8051
2023-10-25 21:41:34,608 saving best model
2023-10-25 21:41:35,363 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:36,857 epoch 9 - iter 27/272 - loss 0.00231539 - time (sec): 1.49 - samples/sec: 3274.11 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:41:38,266 epoch 9 - iter 54/272 - loss 0.00500163 - time (sec): 2.90 - samples/sec: 3069.42 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:41:39,718 epoch 9 - iter 81/272 - loss 0.00555753 - time (sec): 4.35 - samples/sec: 3291.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:41:41,176 epoch 9 - iter 108/272 - loss 0.00495039 - time (sec): 5.81 - samples/sec: 3300.73 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:41:42,740 epoch 9 - iter 135/272 - loss 0.00560261 - time (sec): 7.38 - samples/sec: 3353.92 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:41:44,268 epoch 9 - iter 162/272 - loss 0.00590316 - time (sec): 8.90 - samples/sec: 3494.87 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:41:45,803 epoch 9 - iter 189/272 - loss 0.00586282 - time (sec): 10.44 - samples/sec: 3518.11 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:41:47,368 epoch 9 - iter 216/272 - loss 0.00624663 - time (sec): 12.00 - samples/sec: 3538.85 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:41:48,868 epoch 9 - iter 243/272 - loss 0.00782273 - time (sec): 13.50 - samples/sec: 3556.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:41:50,201 epoch 9 - iter 270/272 - loss 0.00780728 - time (sec): 14.84 - samples/sec: 3480.92 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:41:50,302 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:50,302 EPOCH 9 done: loss 0.0078 - lr: 0.000006
2023-10-25 21:41:51,457 DEV : loss 0.1823568344116211 - f1-score (micro avg) 0.8073
2023-10-25 21:41:51,463 saving best model
2023-10-25 21:41:52,196 ----------------------------------------------------------------------------------------------------
2023-10-25 21:41:53,706 epoch 10 - iter 27/272 - loss 0.00845377 - time (sec): 1.51 - samples/sec: 3165.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:41:55,521 epoch 10 - iter 54/272 - loss 0.00663455 - time (sec): 3.32 - samples/sec: 3013.55 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:41:56,962 epoch 10 - iter 81/272 - loss 0.00509540 - time (sec): 4.76 - samples/sec: 3295.57 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:41:58,360 epoch 10 - iter 108/272 - loss 0.00549684 - time (sec): 6.16 - samples/sec: 3232.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:41:59,792 epoch 10 - iter 135/272 - loss 0.00473920 - time (sec): 7.59 - samples/sec: 3308.63 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:42:01,228 epoch 10 - iter 162/272 - loss 0.00471395 - time (sec): 9.03 - samples/sec: 3339.99 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:42:02,675 epoch 10 - iter 189/272 - loss 0.00409399 - time (sec): 10.48 - samples/sec: 3387.44 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:42:04,112 epoch 10 - iter 216/272 - loss 0.00421195 - time (sec): 11.91 - samples/sec: 3445.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:42:05,525 epoch 10 - iter 243/272 - loss 0.00424710 - time (sec): 13.33 - samples/sec: 3470.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:42:07,001 epoch 10 - iter 270/272 - loss 0.00508973 - time (sec): 14.80 - samples/sec: 3507.30 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:42:07,089 ----------------------------------------------------------------------------------------------------
2023-10-25 21:42:07,089 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-25 21:42:08,275 DEV : loss 0.18975241482257843 - f1-score (micro avg) 0.8022
2023-10-25 21:42:08,776 ----------------------------------------------------------------------------------------------------
2023-10-25 21:42:08,777 Loading model from best epoch ...
2023-10-25 21:42:10,690 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
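The 17-entry tag dictionary matches the tagger head's `Linear(in_features=768, out_features=17)`: one O tag plus S/B/E/I (BIOES) variants of the four entity types. A quick sketch of how that set is composed:

```python
entity_types = ["LOC", "PER", "HumanProd", "ORG"]

# One "outside" tag plus single/begin/end/inside markers per entity type.
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 17, matching out_features=17 in the model above
```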
2023-10-25 21:42:12,589
Results:
- F-score (micro) 0.7927
- F-score (macro) 0.7586
- Accuracy 0.6734

By class:
              precision    recall  f1-score   support

         LOC     0.8152    0.8910    0.8515       312
         PER     0.6892    0.8317    0.7538       208
         ORG     0.5957    0.5091    0.5490        55
   HumanProd     0.7857    1.0000    0.8800        22

   micro avg     0.7511    0.8392    0.7927       597
   macro avg     0.7215    0.8080    0.7586       597
weighted avg     0.7500    0.8392    0.7906       597
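The micro-average row can be cross-checked against the per-class rows: reconstructing approximate TP and prediction counts from each class's precision, recall, and support, then pooling them. This is a sanity check on the table, not Flair's evaluation code:

```python
# (precision, recall, support) per class, copied from the table above.
per_class = {
    "LOC":       (0.8152, 0.8910, 312),
    "PER":       (0.6892, 0.8317, 208),
    "ORG":       (0.5957, 0.5091, 55),
    "HumanProd": (0.7857, 1.0000, 22),
}

tp = pred = gold = 0
for p, r, support in per_class.values():
    c_tp = round(r * support)   # true positives for this class
    tp += c_tp
    pred += round(c_tp / p)     # spans predicted for this class
    gold += support

micro_p = tp / pred                                      # ~0.7511
micro_r = tp / gold                                      # ~0.8392
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)   # ~0.7927
```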
2023-10-25 21:42:12,589 ----------------------------------------------------------------------------------------------------