2023-10-16 19:55:09,277 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
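As a sanity check on the summary above, the printed shapes pin down the model's parameter count. This is pure arithmetic from the module sizes in the log (LockedDropout has no parameters); the ~110.6M total matches the usual BERT-base ballpark for a 32001-token vocabulary:

```python
# Parameter counts derived from the module shapes in the summary above.
d = 768  # hidden size

def linear(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix + bias vector

# BertEmbeddings: word, position, token-type embeddings + LayerNorm (weight + bias)
embeddings = 32001 * d + 512 * d + 2 * d + 2 * d

# One BertLayer: Q/K/V and attention-output projections, two LayerNorms,
# and the 768 -> 3072 -> 768 feed-forward block
per_layer = (
    4 * linear(d, d)      # query, key, value, BertSelfOutput.dense
    + 2 * d               # attention LayerNorm
    + linear(d, 3072)     # BertIntermediate.dense
    + linear(3072, d)     # BertOutput.dense
    + 2 * d               # output LayerNorm
)

pooler = linear(d, d)
bert_total = embeddings + 12 * per_layer + pooler
head = linear(d, 17)  # tagging head over the 17-tag dictionary

print(bert_total, head)
```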
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Train: 1085 sentences
2023-10-16 19:55:09,278 (train_with_dev=False, train_with_test=False)
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Training Params:
2023-10-16 19:55:09,278 - learning_rate: "3e-05"
2023-10-16 19:55:09,278 - mini_batch_size: "4"
2023-10-16 19:55:09,278 - max_epochs: "10"
2023-10-16 19:55:09,278 - shuffle: "True"
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Plugins:
2023-10-16 19:55:09,278 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
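The LinearScheduler plugin warms the learning rate up over the first 10% of updates and then decays it linearly to zero, which is what the lr column in the epoch logs traces (lr: 0.000003 at iter 27, peaking at 3e-05 near the end of epoch 1, reaching zero in epoch 10). A minimal sketch of that schedule, assuming 272 mini-batches per epoch × 10 epochs as in this run; the actual Flair/transformers implementation may differ in rounding details:

```python
def linear_schedule(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero by the final step."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 272 * 10  # mini-batches per epoch x max_epochs
print(linear_schedule(27, total))    # early in epoch 1, still warming up
print(linear_schedule(272, total))   # warmup ends after ~10% of updates
print(linear_schedule(total, total)) # decayed to zero at the final step
```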
2023-10-16 19:55:09,278 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:55:09,278 - metric: "('micro avg', 'f1-score')"
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 Computation:
2023-10-16 19:55:09,279 - compute on device: cuda:0
2023-10-16 19:55:09,279 - embedding storage: none
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:10,798 epoch 1 - iter 27/272 - loss 2.83234957 - time (sec): 1.52 - samples/sec: 3290.84 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:55:12,264 epoch 1 - iter 54/272 - loss 2.43687139 - time (sec): 2.98 - samples/sec: 3407.94 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:55:13,757 epoch 1 - iter 81/272 - loss 1.87521019 - time (sec): 4.48 - samples/sec: 3379.15 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:55:15,248 epoch 1 - iter 108/272 - loss 1.54195388 - time (sec): 5.97 - samples/sec: 3360.98 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:55:16,670 epoch 1 - iter 135/272 - loss 1.34849777 - time (sec): 7.39 - samples/sec: 3310.21 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:55:18,237 epoch 1 - iter 162/272 - loss 1.16896217 - time (sec): 8.96 - samples/sec: 3315.75 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:55:19,779 epoch 1 - iter 189/272 - loss 1.04849700 - time (sec): 10.50 - samples/sec: 3299.64 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:55:21,259 epoch 1 - iter 216/272 - loss 0.95395801 - time (sec): 11.98 - samples/sec: 3306.92 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:22,843 epoch 1 - iter 243/272 - loss 0.87116032 - time (sec): 13.56 - samples/sec: 3302.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:24,711 epoch 1 - iter 270/272 - loss 0.77807401 - time (sec): 15.43 - samples/sec: 3356.32 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:55:24,812 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:24,812 EPOCH 1 done: loss 0.7756 - lr: 0.000030
2023-10-16 19:55:25,806 DEV : loss 0.16201870143413544 - f1-score (micro avg) 0.5989
2023-10-16 19:55:25,810 saving best model
2023-10-16 19:55:26,150 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:27,855 epoch 2 - iter 27/272 - loss 0.16691729 - time (sec): 1.70 - samples/sec: 3558.22 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:55:29,354 epoch 2 - iter 54/272 - loss 0.18391069 - time (sec): 3.20 - samples/sec: 3606.93 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:30,864 epoch 2 - iter 81/272 - loss 0.18206767 - time (sec): 4.71 - samples/sec: 3430.44 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:32,454 epoch 2 - iter 108/272 - loss 0.17658993 - time (sec): 6.30 - samples/sec: 3436.87 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:33,900 epoch 2 - iter 135/272 - loss 0.16868206 - time (sec): 7.75 - samples/sec: 3408.36 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:35,478 epoch 2 - iter 162/272 - loss 0.16939540 - time (sec): 9.33 - samples/sec: 3362.54 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:37,031 epoch 2 - iter 189/272 - loss 0.16409239 - time (sec): 10.88 - samples/sec: 3355.17 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:38,595 epoch 2 - iter 216/272 - loss 0.16010105 - time (sec): 12.44 - samples/sec: 3376.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:39,951 epoch 2 - iter 243/272 - loss 0.15825024 - time (sec): 13.80 - samples/sec: 3338.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:41,608 epoch 2 - iter 270/272 - loss 0.15185793 - time (sec): 15.46 - samples/sec: 3353.38 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:41,692 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:41,692 EPOCH 2 done: loss 0.1516 - lr: 0.000027
2023-10-16 19:55:43,108 DEV : loss 0.11580488085746765 - f1-score (micro avg) 0.784
2023-10-16 19:55:43,112 saving best model
2023-10-16 19:55:43,564 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:45,007 epoch 3 - iter 27/272 - loss 0.08808373 - time (sec): 1.44 - samples/sec: 3170.05 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:46,648 epoch 3 - iter 54/272 - loss 0.09481473 - time (sec): 3.08 - samples/sec: 3413.77 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:48,227 epoch 3 - iter 81/272 - loss 0.10163387 - time (sec): 4.66 - samples/sec: 3434.61 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:49,795 epoch 3 - iter 108/272 - loss 0.09866374 - time (sec): 6.23 - samples/sec: 3487.16 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:51,436 epoch 3 - iter 135/272 - loss 0.09301795 - time (sec): 7.87 - samples/sec: 3426.99 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:53,033 epoch 3 - iter 162/272 - loss 0.09724660 - time (sec): 9.47 - samples/sec: 3404.26 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:54,679 epoch 3 - iter 189/272 - loss 0.09184127 - time (sec): 11.11 - samples/sec: 3375.12 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:56,293 epoch 3 - iter 216/272 - loss 0.09024370 - time (sec): 12.72 - samples/sec: 3366.13 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:57,711 epoch 3 - iter 243/272 - loss 0.08931322 - time (sec): 14.14 - samples/sec: 3314.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:59,194 epoch 3 - iter 270/272 - loss 0.08847740 - time (sec): 15.63 - samples/sec: 3318.50 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:55:59,280 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:59,280 EPOCH 3 done: loss 0.0888 - lr: 0.000023
2023-10-16 19:56:00,703 DEV : loss 0.10064025223255157 - f1-score (micro avg) 0.7821
2023-10-16 19:56:00,707 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:02,368 epoch 4 - iter 27/272 - loss 0.03703618 - time (sec): 1.66 - samples/sec: 3340.49 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:56:04,136 epoch 4 - iter 54/272 - loss 0.04293959 - time (sec): 3.43 - samples/sec: 3405.06 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:56:05,592 epoch 4 - iter 81/272 - loss 0.04695382 - time (sec): 4.88 - samples/sec: 3376.98 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:07,075 epoch 4 - iter 108/272 - loss 0.05388166 - time (sec): 6.37 - samples/sec: 3376.09 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:08,545 epoch 4 - iter 135/272 - loss 0.05558070 - time (sec): 7.84 - samples/sec: 3384.14 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:10,279 epoch 4 - iter 162/272 - loss 0.05373334 - time (sec): 9.57 - samples/sec: 3268.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:11,860 epoch 4 - iter 189/272 - loss 0.05246304 - time (sec): 11.15 - samples/sec: 3309.54 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:13,335 epoch 4 - iter 216/272 - loss 0.05390096 - time (sec): 12.63 - samples/sec: 3265.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:14,867 epoch 4 - iter 243/272 - loss 0.05448464 - time (sec): 14.16 - samples/sec: 3269.11 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:16,429 epoch 4 - iter 270/272 - loss 0.05501421 - time (sec): 15.72 - samples/sec: 3274.45 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:16,551 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:16,551 EPOCH 4 done: loss 0.0547 - lr: 0.000020
2023-10-16 19:56:18,001 DEV : loss 0.11307715624570847 - f1-score (micro avg) 0.8199
2023-10-16 19:56:18,005 saving best model
2023-10-16 19:56:18,442 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:19,980 epoch 5 - iter 27/272 - loss 0.04691907 - time (sec): 1.54 - samples/sec: 3349.16 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:21,422 epoch 5 - iter 54/272 - loss 0.04225722 - time (sec): 2.98 - samples/sec: 3282.75 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:22,868 epoch 5 - iter 81/272 - loss 0.04022072 - time (sec): 4.43 - samples/sec: 3258.89 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:24,349 epoch 5 - iter 108/272 - loss 0.03973764 - time (sec): 5.91 - samples/sec: 3269.21 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:25,917 epoch 5 - iter 135/272 - loss 0.03914369 - time (sec): 7.47 - samples/sec: 3244.34 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:27,596 epoch 5 - iter 162/272 - loss 0.03750230 - time (sec): 9.15 - samples/sec: 3243.35 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:29,363 epoch 5 - iter 189/272 - loss 0.03658967 - time (sec): 10.92 - samples/sec: 3281.76 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:30,973 epoch 5 - iter 216/272 - loss 0.03680208 - time (sec): 12.53 - samples/sec: 3305.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:32,478 epoch 5 - iter 243/272 - loss 0.03570734 - time (sec): 14.03 - samples/sec: 3283.66 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:34,142 epoch 5 - iter 270/272 - loss 0.03533262 - time (sec): 15.70 - samples/sec: 3287.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:34,276 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:34,276 EPOCH 5 done: loss 0.0353 - lr: 0.000017
2023-10-16 19:56:35,722 DEV : loss 0.12297820299863815 - f1-score (micro avg) 0.8154
2023-10-16 19:56:35,727 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:37,444 epoch 6 - iter 27/272 - loss 0.01991166 - time (sec): 1.72 - samples/sec: 3322.31 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:38,878 epoch 6 - iter 54/272 - loss 0.01908545 - time (sec): 3.15 - samples/sec: 3148.40 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:40,414 epoch 6 - iter 81/272 - loss 0.02105140 - time (sec): 4.69 - samples/sec: 3163.24 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:41,821 epoch 6 - iter 108/272 - loss 0.02398416 - time (sec): 6.09 - samples/sec: 3170.19 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:43,253 epoch 6 - iter 135/272 - loss 0.02455143 - time (sec): 7.53 - samples/sec: 3197.84 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:44,862 epoch 6 - iter 162/272 - loss 0.02537861 - time (sec): 9.13 - samples/sec: 3200.04 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:46,456 epoch 6 - iter 189/272 - loss 0.02638712 - time (sec): 10.73 - samples/sec: 3231.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:48,089 epoch 6 - iter 216/272 - loss 0.02536443 - time (sec): 12.36 - samples/sec: 3289.75 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:49,795 epoch 6 - iter 243/272 - loss 0.02486984 - time (sec): 14.07 - samples/sec: 3329.80 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:51,361 epoch 6 - iter 270/272 - loss 0.02422424 - time (sec): 15.63 - samples/sec: 3310.76 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:51,454 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:51,455 EPOCH 6 done: loss 0.0248 - lr: 0.000013
2023-10-16 19:56:52,874 DEV : loss 0.12834559381008148 - f1-score (micro avg) 0.8392
2023-10-16 19:56:52,878 saving best model
2023-10-16 19:56:53,283 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:54,889 epoch 7 - iter 27/272 - loss 0.01509354 - time (sec): 1.60 - samples/sec: 3342.76 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:56,559 epoch 7 - iter 54/272 - loss 0.02142213 - time (sec): 3.27 - samples/sec: 3407.39 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:58,145 epoch 7 - iter 81/272 - loss 0.01832991 - time (sec): 4.86 - samples/sec: 3396.16 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:56:59,710 epoch 7 - iter 108/272 - loss 0.01959823 - time (sec): 6.42 - samples/sec: 3400.56 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:57:01,243 epoch 7 - iter 135/272 - loss 0.01831873 - time (sec): 7.95 - samples/sec: 3428.75 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:57:02,746 epoch 7 - iter 162/272 - loss 0.02072289 - time (sec): 9.46 - samples/sec: 3380.77 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:04,279 epoch 7 - iter 189/272 - loss 0.01977148 - time (sec): 10.99 - samples/sec: 3358.94 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:05,886 epoch 7 - iter 216/272 - loss 0.01911425 - time (sec): 12.60 - samples/sec: 3368.28 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:07,420 epoch 7 - iter 243/272 - loss 0.01880223 - time (sec): 14.13 - samples/sec: 3359.24 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:08,916 epoch 7 - iter 270/272 - loss 0.02047643 - time (sec): 15.63 - samples/sec: 3318.25 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:09,000 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:09,000 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-16 19:57:10,600 DEV : loss 0.1370716243982315 - f1-score (micro avg) 0.8439
2023-10-16 19:57:10,604 saving best model
2023-10-16 19:57:11,024 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:12,531 epoch 8 - iter 27/272 - loss 0.01570058 - time (sec): 1.51 - samples/sec: 3260.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:14,135 epoch 8 - iter 54/272 - loss 0.01138672 - time (sec): 3.11 - samples/sec: 3476.00 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:15,765 epoch 8 - iter 81/272 - loss 0.01524252 - time (sec): 4.74 - samples/sec: 3507.80 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:17,378 epoch 8 - iter 108/272 - loss 0.01459537 - time (sec): 6.35 - samples/sec: 3513.07 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:18,810 epoch 8 - iter 135/272 - loss 0.01560028 - time (sec): 7.78 - samples/sec: 3519.74 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:20,474 epoch 8 - iter 162/272 - loss 0.01486699 - time (sec): 9.45 - samples/sec: 3456.85 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:21,873 epoch 8 - iter 189/272 - loss 0.01430914 - time (sec): 10.85 - samples/sec: 3448.67 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:23,213 epoch 8 - iter 216/272 - loss 0.01524647 - time (sec): 12.19 - samples/sec: 3422.66 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:24,686 epoch 8 - iter 243/272 - loss 0.01493429 - time (sec): 13.66 - samples/sec: 3398.56 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:26,272 epoch 8 - iter 270/272 - loss 0.01498748 - time (sec): 15.25 - samples/sec: 3398.46 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:26,355 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:26,355 EPOCH 8 done: loss 0.0153 - lr: 0.000007
2023-10-16 19:57:27,779 DEV : loss 0.15467961132526398 - f1-score (micro avg) 0.8199
2023-10-16 19:57:27,783 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:29,287 epoch 9 - iter 27/272 - loss 0.02729236 - time (sec): 1.50 - samples/sec: 3297.39 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:31,004 epoch 9 - iter 54/272 - loss 0.01595842 - time (sec): 3.22 - samples/sec: 3355.44 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:32,462 epoch 9 - iter 81/272 - loss 0.01475055 - time (sec): 4.68 - samples/sec: 3375.42 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:34,011 epoch 9 - iter 108/272 - loss 0.01321516 - time (sec): 6.23 - samples/sec: 3437.62 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:35,432 epoch 9 - iter 135/272 - loss 0.01577781 - time (sec): 7.65 - samples/sec: 3366.56 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:37,001 epoch 9 - iter 162/272 - loss 0.01510896 - time (sec): 9.22 - samples/sec: 3434.11 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:38,374 epoch 9 - iter 189/272 - loss 0.01394347 - time (sec): 10.59 - samples/sec: 3375.39 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:39,957 epoch 9 - iter 216/272 - loss 0.01343107 - time (sec): 12.17 - samples/sec: 3371.56 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:41,408 epoch 9 - iter 243/272 - loss 0.01270919 - time (sec): 13.62 - samples/sec: 3347.79 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:43,155 epoch 9 - iter 270/272 - loss 0.01263245 - time (sec): 15.37 - samples/sec: 3366.15 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:43,246 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:43,246 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-16 19:57:44,689 DEV : loss 0.15767242014408112 - f1-score (micro avg) 0.833
2023-10-16 19:57:44,694 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:46,340 epoch 10 - iter 27/272 - loss 0.00835601 - time (sec): 1.64 - samples/sec: 3466.33 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:47,783 epoch 10 - iter 54/272 - loss 0.00485101 - time (sec): 3.09 - samples/sec: 3283.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:49,267 epoch 10 - iter 81/272 - loss 0.00584821 - time (sec): 4.57 - samples/sec: 3248.55 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:50,925 epoch 10 - iter 108/272 - loss 0.00558808 - time (sec): 6.23 - samples/sec: 3275.67 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:52,542 epoch 10 - iter 135/272 - loss 0.00481861 - time (sec): 7.85 - samples/sec: 3322.11 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:54,193 epoch 10 - iter 162/272 - loss 0.00836834 - time (sec): 9.50 - samples/sec: 3313.17 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:55,900 epoch 10 - iter 189/272 - loss 0.00829871 - time (sec): 11.20 - samples/sec: 3295.75 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:57,412 epoch 10 - iter 216/272 - loss 0.00879782 - time (sec): 12.72 - samples/sec: 3265.60 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:58,914 epoch 10 - iter 243/272 - loss 0.00937661 - time (sec): 14.22 - samples/sec: 3261.54 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:58:00,543 epoch 10 - iter 270/272 - loss 0.00993645 - time (sec): 15.85 - samples/sec: 3265.36 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:58:00,639 ----------------------------------------------------------------------------------------------------
2023-10-16 19:58:00,640 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-16 19:58:02,063 DEV : loss 0.16342675685882568 - f1-score (micro avg) 0.819
2023-10-16 19:58:02,408 ----------------------------------------------------------------------------------------------------
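Model selection in this run is by dev micro-F1: "saving best model" fires whenever the dev score improves, so best-model.pt ends up holding the epoch 7 checkpoint even though training ran for 10 epochs. A small sketch of that selection rule over the per-epoch dev scores reported above:

```python
# Dev f1-score (micro avg) per epoch, as reported in the log above.
dev_f1 = {1: 0.5989, 2: 0.784, 3: 0.7821, 4: 0.8199, 5: 0.8154,
          6: 0.8392, 7: 0.8439, 8: 0.8199, 9: 0.833, 10: 0.819}

# best-model.pt is overwritten each time the dev score improves,
# so the surviving checkpoint corresponds to the argmax epoch.
best_epoch = max(dev_f1, key=dev_f1.get)
print(best_epoch, dev_f1[best_epoch])
```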
2023-10-16 19:58:02,409 Loading model from best epoch ...
2023-10-16 19:58:04,004 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 19:58:05,970
Results:
- F-score (micro) 0.7955
- F-score (macro) 0.7533
- Accuracy 0.6786

By class:
              precision    recall  f1-score   support

         LOC     0.8212    0.8686    0.8442       312
         PER     0.7344    0.8510    0.7884       208
         ORG     0.5000    0.4000    0.4444        55
   HumanProd     0.8800    1.0000    0.9362        22

   micro avg     0.7688    0.8241    0.7955       597
   macro avg     0.7339    0.7799    0.7533       597
weighted avg     0.7636    0.8241    0.7913       597

2023-10-16 19:58:05,970 ----------------------------------------------------------------------------------------------------
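As a sanity check on the test report above: the micro F1 is the harmonic mean of the micro-averaged precision and recall, while the macro F1 is the unweighted mean of the per-class F1 column (not the harmonic mean of macro precision and recall). Verifying both against the numbers in the report:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.7688, recall 0.8241
micro_f1 = round(f1(0.7688, 0.8241), 4)

# per-class f1-score column: LOC, PER, ORG, HumanProd
macro_f1 = round((0.8442 + 0.7884 + 0.4444 + 0.9362) / 4, 4)

print(micro_f1, macro_f1)
```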