|
2023-10-19 12:12:24,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,058 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-19 12:12:24,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,058 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences |
|
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator |
|
2023-10-19 12:12:24,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,058 Train: 20847 sentences |
|
2023-10-19 12:12:24,058 (train_with_dev=False, train_with_test=False) |
|
2023-10-19 12:12:24,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,058 Training Params: |
|
2023-10-19 12:12:24,058 - learning_rate: "5e-05" |
|
2023-10-19 12:12:24,058 - mini_batch_size: "8" |
|
2023-10-19 12:12:24,058 - max_epochs: "10" |
|
2023-10-19 12:12:24,058 - shuffle: "True" |
|
2023-10-19 12:12:24,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,058 Plugins: |
|
2023-10-19 12:12:24,058 - TensorboardLogger |
|
2023-10-19 12:12:24,059 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-19 12:12:24,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,059 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-19 12:12:24,059 - metric: "('micro avg', 'f1-score')" |
|
2023-10-19 12:12:24,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,059 Computation: |
|
2023-10-19 12:12:24,059 - compute on device: cuda:0 |
|
2023-10-19 12:12:24,059 - embedding storage: none |
|
2023-10-19 12:12:24,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,059 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-19 12:12:24,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:12:24,059 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-19 12:12:30,245 epoch 1 - iter 260/2606 - loss 3.29215868 - time (sec): 6.19 - samples/sec: 6125.45 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 12:12:36,395 epoch 1 - iter 520/2606 - loss 2.54663351 - time (sec): 12.34 - samples/sec: 6036.79 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 12:12:42,287 epoch 1 - iter 780/2606 - loss 1.98000977 - time (sec): 18.23 - samples/sec: 5915.66 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 12:12:48,508 epoch 1 - iter 1040/2606 - loss 1.60194669 - time (sec): 24.45 - samples/sec: 5924.28 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 12:12:54,610 epoch 1 - iter 1300/2606 - loss 1.39106299 - time (sec): 30.55 - samples/sec: 5902.93 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 12:13:01,049 epoch 1 - iter 1560/2606 - loss 1.23459866 - time (sec): 36.99 - samples/sec: 5904.98 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 12:13:07,064 epoch 1 - iter 1820/2606 - loss 1.13126722 - time (sec): 43.00 - samples/sec: 5895.47 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-19 12:13:13,149 epoch 1 - iter 2080/2606 - loss 1.04840712 - time (sec): 49.09 - samples/sec: 5904.15 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-19 12:13:19,453 epoch 1 - iter 2340/2606 - loss 0.97142483 - time (sec): 55.39 - samples/sec: 5935.98 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-19 12:13:25,404 epoch 1 - iter 2600/2606 - loss 0.91090610 - time (sec): 61.34 - samples/sec: 5977.09 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-19 12:13:25,524 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:13:25,524 EPOCH 1 done: loss 0.9097 - lr: 0.000050 |
|
2023-10-19 12:13:27,771 DEV : loss 0.14083856344223022 - f1-score (micro avg) 0.049 |
|
2023-10-19 12:13:27,795 saving best model |
|
2023-10-19 12:13:27,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:13:33,910 epoch 2 - iter 260/2606 - loss 0.33750708 - time (sec): 6.08 - samples/sec: 6097.84 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-19 12:13:40,145 epoch 2 - iter 520/2606 - loss 0.34632728 - time (sec): 12.32 - samples/sec: 6177.17 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-19 12:13:46,357 epoch 2 - iter 780/2606 - loss 0.33738274 - time (sec): 18.53 - samples/sec: 6059.22 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-19 12:13:52,438 epoch 2 - iter 1040/2606 - loss 0.33995850 - time (sec): 24.61 - samples/sec: 6083.78 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-19 12:13:57,789 epoch 2 - iter 1300/2606 - loss 0.34055467 - time (sec): 29.96 - samples/sec: 6142.43 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-19 12:14:03,922 epoch 2 - iter 1560/2606 - loss 0.33755029 - time (sec): 36.09 - samples/sec: 6101.82 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-19 12:14:10,023 epoch 2 - iter 1820/2606 - loss 0.33322099 - time (sec): 42.19 - samples/sec: 6115.49 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-19 12:14:16,125 epoch 2 - iter 2080/2606 - loss 0.32900406 - time (sec): 48.30 - samples/sec: 6057.42 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-19 12:14:22,224 epoch 2 - iter 2340/2606 - loss 0.32781761 - time (sec): 54.40 - samples/sec: 6080.19 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-19 12:14:28,034 epoch 2 - iter 2600/2606 - loss 0.32378050 - time (sec): 60.21 - samples/sec: 6090.09 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-19 12:14:28,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:14:28,158 EPOCH 2 done: loss 0.3239 - lr: 0.000044 |
|
2023-10-19 12:14:33,294 DEV : loss 0.13072499632835388 - f1-score (micro avg) 0.2529 |
|
2023-10-19 12:14:33,317 saving best model |
|
2023-10-19 12:14:33,350 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:14:39,474 epoch 3 - iter 260/2606 - loss 0.25574137 - time (sec): 6.12 - samples/sec: 5781.84 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-19 12:14:45,559 epoch 3 - iter 520/2606 - loss 0.26774606 - time (sec): 12.21 - samples/sec: 5992.37 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-19 12:14:51,497 epoch 3 - iter 780/2606 - loss 0.26961024 - time (sec): 18.15 - samples/sec: 5774.06 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-19 12:14:57,569 epoch 3 - iter 1040/2606 - loss 0.26961308 - time (sec): 24.22 - samples/sec: 5862.06 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-19 12:15:03,380 epoch 3 - iter 1300/2606 - loss 0.27095396 - time (sec): 30.03 - samples/sec: 5966.12 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-19 12:15:09,637 epoch 3 - iter 1560/2606 - loss 0.27006005 - time (sec): 36.29 - samples/sec: 6027.28 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-19 12:15:15,650 epoch 3 - iter 1820/2606 - loss 0.26808602 - time (sec): 42.30 - samples/sec: 6043.76 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-19 12:15:22,109 epoch 3 - iter 2080/2606 - loss 0.26544069 - time (sec): 48.76 - samples/sec: 6025.06 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-19 12:15:28,256 epoch 3 - iter 2340/2606 - loss 0.26734771 - time (sec): 54.91 - samples/sec: 6006.07 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-19 12:15:34,479 epoch 3 - iter 2600/2606 - loss 0.26623336 - time (sec): 61.13 - samples/sec: 6001.01 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-19 12:15:34,625 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:15:34,625 EPOCH 3 done: loss 0.2661 - lr: 0.000039 |
|
2023-10-19 12:15:39,746 DEV : loss 0.14174337685108185 - f1-score (micro avg) 0.2721 |
|
2023-10-19 12:15:39,769 saving best model |
|
2023-10-19 12:15:39,801 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:15:45,896 epoch 4 - iter 260/2606 - loss 0.25139617 - time (sec): 6.09 - samples/sec: 6246.82 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-19 12:15:52,182 epoch 4 - iter 520/2606 - loss 0.23480749 - time (sec): 12.38 - samples/sec: 6268.11 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-19 12:15:58,276 epoch 4 - iter 780/2606 - loss 0.24352221 - time (sec): 18.47 - samples/sec: 6182.49 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-19 12:16:04,376 epoch 4 - iter 1040/2606 - loss 0.24831904 - time (sec): 24.57 - samples/sec: 6075.18 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-19 12:16:10,652 epoch 4 - iter 1300/2606 - loss 0.24218416 - time (sec): 30.85 - samples/sec: 6009.23 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-19 12:16:16,800 epoch 4 - iter 1560/2606 - loss 0.23924425 - time (sec): 37.00 - samples/sec: 5955.66 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-19 12:16:23,005 epoch 4 - iter 1820/2606 - loss 0.23697081 - time (sec): 43.20 - samples/sec: 5993.23 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-19 12:16:29,000 epoch 4 - iter 2080/2606 - loss 0.23564226 - time (sec): 49.20 - samples/sec: 5965.50 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-19 12:16:35,148 epoch 4 - iter 2340/2606 - loss 0.23533143 - time (sec): 55.35 - samples/sec: 5934.69 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-19 12:16:41,289 epoch 4 - iter 2600/2606 - loss 0.23376594 - time (sec): 61.49 - samples/sec: 5959.34 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-19 12:16:41,416 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:16:41,417 EPOCH 4 done: loss 0.2338 - lr: 0.000033 |
|
2023-10-19 12:16:46,618 DEV : loss 0.14498400688171387 - f1-score (micro avg) 0.2627 |
|
2023-10-19 12:16:46,641 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:16:52,681 epoch 5 - iter 260/2606 - loss 0.19023305 - time (sec): 6.04 - samples/sec: 5769.39 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-19 12:16:58,701 epoch 5 - iter 520/2606 - loss 0.21196067 - time (sec): 12.06 - samples/sec: 5855.59 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-19 12:17:04,898 epoch 5 - iter 780/2606 - loss 0.21458042 - time (sec): 18.26 - samples/sec: 5909.45 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-19 12:17:11,106 epoch 5 - iter 1040/2606 - loss 0.21431244 - time (sec): 24.46 - samples/sec: 5961.35 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-19 12:17:17,204 epoch 5 - iter 1300/2606 - loss 0.21745310 - time (sec): 30.56 - samples/sec: 5900.25 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-19 12:17:23,385 epoch 5 - iter 1560/2606 - loss 0.21540934 - time (sec): 36.74 - samples/sec: 5920.43 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 12:17:29,536 epoch 5 - iter 1820/2606 - loss 0.21605849 - time (sec): 42.89 - samples/sec: 5966.77 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 12:17:35,911 epoch 5 - iter 2080/2606 - loss 0.21388261 - time (sec): 49.27 - samples/sec: 5952.00 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 12:17:42,028 epoch 5 - iter 2340/2606 - loss 0.21150192 - time (sec): 55.39 - samples/sec: 5950.63 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 12:17:48,297 epoch 5 - iter 2600/2606 - loss 0.21236748 - time (sec): 61.65 - samples/sec: 5944.70 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 12:17:48,443 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:17:48,443 EPOCH 5 done: loss 0.2122 - lr: 0.000028 |
|
2023-10-19 12:17:53,633 DEV : loss 0.16542117297649384 - f1-score (micro avg) 0.2692 |
|
2023-10-19 12:17:53,657 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:17:59,754 epoch 6 - iter 260/2606 - loss 0.18992843 - time (sec): 6.10 - samples/sec: 5941.18 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 12:18:05,959 epoch 6 - iter 520/2606 - loss 0.19973136 - time (sec): 12.30 - samples/sec: 6061.48 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 12:18:11,983 epoch 6 - iter 780/2606 - loss 0.20044914 - time (sec): 18.33 - samples/sec: 6043.26 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 12:18:18,208 epoch 6 - iter 1040/2606 - loss 0.19582093 - time (sec): 24.55 - samples/sec: 6172.37 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 12:18:24,346 epoch 6 - iter 1300/2606 - loss 0.19555968 - time (sec): 30.69 - samples/sec: 6129.34 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 12:18:30,281 epoch 6 - iter 1560/2606 - loss 0.19819069 - time (sec): 36.62 - samples/sec: 6027.05 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 12:18:36,257 epoch 6 - iter 1820/2606 - loss 0.19310853 - time (sec): 42.60 - samples/sec: 6027.45 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 12:18:42,270 epoch 6 - iter 2080/2606 - loss 0.19523538 - time (sec): 48.61 - samples/sec: 6028.21 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 12:18:48,569 epoch 6 - iter 2340/2606 - loss 0.19586601 - time (sec): 54.91 - samples/sec: 6030.74 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 12:18:54,595 epoch 6 - iter 2600/2606 - loss 0.19577802 - time (sec): 60.94 - samples/sec: 6012.96 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 12:18:54,761 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:18:54,762 EPOCH 6 done: loss 0.1958 - lr: 0.000022 |
|
2023-10-19 12:19:00,041 DEV : loss 0.1602775603532791 - f1-score (micro avg) 0.287 |
|
2023-10-19 12:19:00,064 saving best model |
|
2023-10-19 12:19:00,100 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:19:06,390 epoch 7 - iter 260/2606 - loss 0.19220321 - time (sec): 6.29 - samples/sec: 5805.07 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 12:19:12,713 epoch 7 - iter 520/2606 - loss 0.17774848 - time (sec): 12.61 - samples/sec: 5762.84 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 12:19:18,869 epoch 7 - iter 780/2606 - loss 0.18234572 - time (sec): 18.77 - samples/sec: 5790.14 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 12:19:25,087 epoch 7 - iter 1040/2606 - loss 0.18561093 - time (sec): 24.99 - samples/sec: 5882.59 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 12:19:31,197 epoch 7 - iter 1300/2606 - loss 0.18840634 - time (sec): 31.10 - samples/sec: 5838.80 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 12:19:37,367 epoch 7 - iter 1560/2606 - loss 0.18474661 - time (sec): 37.27 - samples/sec: 5861.62 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 12:19:43,563 epoch 7 - iter 1820/2606 - loss 0.18465378 - time (sec): 43.46 - samples/sec: 5858.84 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 12:19:49,879 epoch 7 - iter 2080/2606 - loss 0.18296343 - time (sec): 49.78 - samples/sec: 5858.93 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 12:19:56,085 epoch 7 - iter 2340/2606 - loss 0.18362259 - time (sec): 55.98 - samples/sec: 5851.91 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 12:20:02,342 epoch 7 - iter 2600/2606 - loss 0.18183094 - time (sec): 62.24 - samples/sec: 5886.07 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 12:20:02,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:20:02,494 EPOCH 7 done: loss 0.1820 - lr: 0.000017 |
|
2023-10-19 12:20:07,043 DEV : loss 0.1663893163204193 - f1-score (micro avg) 0.2606 |
|
2023-10-19 12:20:07,067 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:20:13,480 epoch 8 - iter 260/2606 - loss 0.17327680 - time (sec): 6.41 - samples/sec: 5899.14 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 12:20:20,259 epoch 8 - iter 520/2606 - loss 0.17288993 - time (sec): 13.19 - samples/sec: 5772.22 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 12:20:26,391 epoch 8 - iter 780/2606 - loss 0.17832513 - time (sec): 19.32 - samples/sec: 5847.05 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 12:20:32,449 epoch 8 - iter 1040/2606 - loss 0.17895989 - time (sec): 25.38 - samples/sec: 5784.84 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 12:20:38,654 epoch 8 - iter 1300/2606 - loss 0.17337141 - time (sec): 31.59 - samples/sec: 5857.95 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 12:20:44,733 epoch 8 - iter 1560/2606 - loss 0.17298900 - time (sec): 37.67 - samples/sec: 5825.43 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 12:20:50,908 epoch 8 - iter 1820/2606 - loss 0.17399534 - time (sec): 43.84 - samples/sec: 5852.70 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 12:20:57,214 epoch 8 - iter 2080/2606 - loss 0.17397486 - time (sec): 50.15 - samples/sec: 5834.64 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 12:21:03,425 epoch 8 - iter 2340/2606 - loss 0.17437299 - time (sec): 56.36 - samples/sec: 5844.70 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 12:21:09,586 epoch 8 - iter 2600/2606 - loss 0.17324575 - time (sec): 62.52 - samples/sec: 5858.13 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 12:21:09,745 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:21:09,745 EPOCH 8 done: loss 0.1734 - lr: 0.000011 |
|
2023-10-19 12:21:14,256 DEV : loss 0.1701161116361618 - f1-score (micro avg) 0.2849 |
|
2023-10-19 12:21:14,281 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:21:20,114 epoch 9 - iter 260/2606 - loss 0.16285426 - time (sec): 5.83 - samples/sec: 6017.71 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 12:21:26,305 epoch 9 - iter 520/2606 - loss 0.15650374 - time (sec): 12.02 - samples/sec: 5899.13 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 12:21:32,497 epoch 9 - iter 780/2606 - loss 0.15289815 - time (sec): 18.22 - samples/sec: 5990.64 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 12:21:38,744 epoch 9 - iter 1040/2606 - loss 0.15983102 - time (sec): 24.46 - samples/sec: 5990.22 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 12:21:44,897 epoch 9 - iter 1300/2606 - loss 0.15901456 - time (sec): 30.62 - samples/sec: 6016.51 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 12:21:51,948 epoch 9 - iter 1560/2606 - loss 0.16225996 - time (sec): 37.67 - samples/sec: 5873.77 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 12:21:58,092 epoch 9 - iter 1820/2606 - loss 0.16300920 - time (sec): 43.81 - samples/sec: 5858.05 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 12:22:04,260 epoch 9 - iter 2080/2606 - loss 0.16260252 - time (sec): 49.98 - samples/sec: 5885.13 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 12:22:10,385 epoch 9 - iter 2340/2606 - loss 0.16370810 - time (sec): 56.10 - samples/sec: 5887.77 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 12:22:16,530 epoch 9 - iter 2600/2606 - loss 0.16412782 - time (sec): 62.25 - samples/sec: 5886.33 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 12:22:16,678 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:22:16,678 EPOCH 9 done: loss 0.1641 - lr: 0.000006 |
|
2023-10-19 12:22:21,175 DEV : loss 0.17923708260059357 - f1-score (micro avg) 0.2855 |
|
2023-10-19 12:22:21,199 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:22:27,204 epoch 10 - iter 260/2606 - loss 0.16832198 - time (sec): 6.00 - samples/sec: 5414.49 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 12:22:33,408 epoch 10 - iter 520/2606 - loss 0.15961189 - time (sec): 12.21 - samples/sec: 5784.32 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 12:22:39,261 epoch 10 - iter 780/2606 - loss 0.16066261 - time (sec): 18.06 - samples/sec: 5815.64 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 12:22:45,464 epoch 10 - iter 1040/2606 - loss 0.16755305 - time (sec): 24.26 - samples/sec: 5869.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 12:22:51,624 epoch 10 - iter 1300/2606 - loss 0.16288242 - time (sec): 30.43 - samples/sec: 5908.64 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 12:22:57,929 epoch 10 - iter 1560/2606 - loss 0.16326348 - time (sec): 36.73 - samples/sec: 5945.25 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 12:23:03,980 epoch 10 - iter 1820/2606 - loss 0.16190089 - time (sec): 42.78 - samples/sec: 5896.27 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 12:23:09,938 epoch 10 - iter 2080/2606 - loss 0.16132966 - time (sec): 48.74 - samples/sec: 5965.88 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 12:23:16,219 epoch 10 - iter 2340/2606 - loss 0.16166399 - time (sec): 55.02 - samples/sec: 5992.67 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 12:23:23,237 epoch 10 - iter 2600/2606 - loss 0.16180342 - time (sec): 62.04 - samples/sec: 5914.47 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 12:23:23,378 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:23:23,378 EPOCH 10 done: loss 0.1617 - lr: 0.000000 |
|
2023-10-19 12:23:27,891 DEV : loss 0.18292318284511566 - f1-score (micro avg) 0.2881 |
|
2023-10-19 12:23:27,916 saving best model |
|
2023-10-19 12:23:27,974 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 12:23:27,974 Loading model from best epoch ... |
|
2023-10-19 12:23:28,044 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-19 12:23:34,357 |
|
Results: |
|
- F-score (micro) 0.321 |
|
- F-score (macro) 0.182 |
|
- Accuracy 0.1932 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.4269    0.5288    0.4724      1214 |
|
         PER     0.1741    0.2178    0.1935       808 |
|
         ORG     0.0652    0.0595    0.0622       353 |
|
   HumanProd     0.0000    0.0000    0.0000        15 |
|
|
|
   micro avg     0.2957    0.3510    0.3210      2390 |
|
   macro avg     0.1665    0.2015    0.1820      2390 |
|
weighted avg     0.2853    0.3510    0.3146      2390 |
|
|
|
2023-10-19 12:23:34,357 ---------------------------------------------------------------------------------------------------- |
|
|
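Note on the summary rows: the averaged F-scores in the "By class" report can be re-derived from the per-class rows. The sketch below is only a sanity check on the logged numbers, not Flair's evaluation code. The variable names (`per_class`, `f1`, `micro_f1`, and so on) are made up for illustration, and the inputs are the four-decimal values copied from the log, so results agree with the report only up to rounding.

```python
# Per-class rows copied from the "By class" report in the log:
# label -> (precision, recall, f1-score, support)
per_class = {
    "LOC": (0.4269, 0.5288, 0.4724, 1214),
    "PER": (0.1741, 0.2178, 0.1935, 808),
    "ORG": (0.0652, 0.0595, 0.0622, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1: harmonic mean of the micro-averaged precision and recall
# taken from the "micro avg" row of the log.
micro_f1 = f1(0.2957, 0.3510)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 scores weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The 15 HumanProd spans count as much as the 1214 LOC spans in the macro average, which is why the macro F-score sits well below the micro F-score in this run.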