2023-10-19 12:53:40,565 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Train: 20847 sentences
2023-10-19 12:53:40,566 (train_with_dev=False, train_with_test=False)
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Training Params:
2023-10-19 12:53:40,566 - learning_rate: "3e-05"
2023-10-19 12:53:40,566 - mini_batch_size: "8"
2023-10-19 12:53:40,566 - max_epochs: "10"
2023-10-19 12:53:40,566 - shuffle: "True"
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Plugins:
2023-10-19 12:53:40,566 - TensorboardLogger
2023-10-19 12:53:40,566 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 12:53:40,566 - metric: "('micro avg', 'f1-score')"
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Computation:
2023-10-19 12:53:40,566 - compute on device: cuda:0
2023-10-19 12:53:40,566 - embedding storage: none
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,566 ----------------------------------------------------------------------------------------------------
2023-10-19 12:53:40,567 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 12:53:46,994 epoch 1 - iter 260/2606 - loss 3.67937496 - time (sec): 6.43 - samples/sec: 5980.08 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:53:53,209 epoch 1 - iter 520/2606 - loss 3.24386293 - time (sec): 12.64 - samples/sec: 5849.82 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:53:59,467 epoch 1 - iter 780/2606 - loss 2.63630803 - time (sec): 18.90 - samples/sec: 5731.33 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:54:05,713 epoch 1 - iter 1040/2606 - loss 2.12598618 - time (sec): 25.15 - samples/sec: 5811.19 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:54:11,767 epoch 1 - iter 1300/2606 - loss 1.82710241 - time (sec): 31.20 - samples/sec: 5795.87 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:54:18,054 epoch 1 - iter 1560/2606 - loss 1.59512691 - time (sec): 37.49 - samples/sec: 5856.66 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:54:24,290 epoch 1 - iter 1820/2606 - loss 1.42908655 - time (sec): 43.72 - samples/sec: 5894.01 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:54:30,228 epoch 1 - iter 2080/2606 - loss 1.31140792 - time (sec): 49.66 - samples/sec: 5890.34 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:54:36,425 epoch 1 - iter 2340/2606 - loss 1.21246650 - time (sec): 55.86 - samples/sec: 5913.27 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:54:42,693 epoch 1 - iter 2600/2606 - loss 1.13822888 - time (sec): 62.13 - samples/sec: 5898.17 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:54:42,850 ----------------------------------------------------------------------------------------------------
2023-10-19 12:54:42,851 EPOCH 1 done: loss 1.1361 - lr: 0.000030
2023-10-19 12:54:45,044 DEV : loss 0.1448792666196823 - f1-score (micro avg)  0.0
2023-10-19 12:54:45,068 ----------------------------------------------------------------------------------------------------
2023-10-19 12:54:51,174 epoch 2 - iter 260/2606 - loss 0.39574742 - time (sec): 6.11 - samples/sec: 6139.86 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:54:57,497 epoch 2 - iter 520/2606 - loss 0.38170985 - time (sec): 12.43 - samples/sec: 6241.58 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:55:03,664 epoch 2 - iter 780/2606 - loss 0.37268726 - time (sec): 18.60 - samples/sec: 6194.28 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:55:09,619 epoch 2 - iter 1040/2606 - loss 0.37326784 - time (sec): 24.55 - samples/sec: 6081.96 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:55:15,764 epoch 2 - iter 1300/2606 - loss 0.37596279 - time (sec): 30.70 - samples/sec: 6065.98 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:55:21,943 epoch 2 - iter 1560/2606 - loss 0.37321605 - time (sec): 36.88 - samples/sec: 6044.47 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:55:28,042 epoch 2 - iter 1820/2606 - loss 0.37172397 - time (sec): 42.97 - samples/sec: 6017.42 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:55:34,294 epoch 2 - iter 2080/2606 - loss 0.36866980 - time (sec): 49.23 - samples/sec: 5970.49 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:55:40,394 epoch 2 - iter 2340/2606 - loss 0.36454602 - time (sec): 55.33 - samples/sec: 5937.57 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:55:46,633 epoch 2 - iter 2600/2606 - loss 0.35760569 - time (sec): 61.57 - samples/sec: 5950.92 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:55:46,778 ----------------------------------------------------------------------------------------------------
2023-10-19 12:55:46,778 EPOCH 2 done: loss 0.3576 - lr: 0.000027
2023-10-19 12:55:51,945 DEV : loss 0.13413937389850616 - f1-score (micro avg)  0.2467
2023-10-19 12:55:51,969 saving best model
2023-10-19 12:55:52,001 ----------------------------------------------------------------------------------------------------
2023-10-19 12:55:57,974 epoch 3 - iter 260/2606 - loss 0.32191777 - time (sec): 5.97 - samples/sec: 6167.93 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:56:04,137 epoch 3 - iter 520/2606 - loss 0.30596009 - time (sec): 12.14 - samples/sec: 6348.21 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:56:10,301 epoch 3 - iter 780/2606 - loss 0.31758928 - time (sec): 18.30 - samples/sec: 6103.89 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:56:16,432 epoch 3 - iter 1040/2606 - loss 0.32150981 - time (sec): 24.43 - samples/sec: 5994.76 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:56:22,767 epoch 3 - iter 1300/2606 - loss 0.31244734 - time (sec): 30.77 - samples/sec: 5976.99 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:56:29,301 epoch 3 - iter 1560/2606 - loss 0.31178616 - time (sec): 37.30 - samples/sec: 6026.29 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:56:35,320 epoch 3 - iter 1820/2606 - loss 0.30802277 - time (sec): 43.32 - samples/sec: 5936.44 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:56:41,531 epoch 3 - iter 2080/2606 - loss 0.30678566 - time (sec): 49.53 - samples/sec: 5884.82 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:56:47,917 epoch 3 - iter 2340/2606 - loss 0.30562571 - time (sec): 55.92 - samples/sec: 5920.29 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:56:54,023 epoch 3 - iter 2600/2606 - loss 0.30437315 - time (sec): 62.02 - samples/sec: 5915.34 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:56:54,166 ----------------------------------------------------------------------------------------------------
2023-10-19 12:56:54,166 EPOCH 3 done: loss 0.3045 - lr: 0.000023
2023-10-19 12:56:59,322 DEV : loss 0.13821715116500854 - f1-score (micro avg)  0.265
2023-10-19 12:56:59,346 saving best model
2023-10-19 12:56:59,381 ----------------------------------------------------------------------------------------------------
2023-10-19 12:57:05,902 epoch 4 - iter 260/2606 - loss 0.27188149 - time (sec): 6.52 - samples/sec: 5800.89 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:57:12,009 epoch 4 - iter 520/2606 - loss 0.28019343 - time (sec): 12.63 - samples/sec: 5807.42 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:57:18,232 epoch 4 - iter 780/2606 - loss 0.27723607 - time (sec): 18.85 - samples/sec: 5892.61 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:57:24,366 epoch 4 - iter 1040/2606 - loss 0.28685163 - time (sec): 24.98 - samples/sec: 5841.93 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:57:30,457 epoch 4 - iter 1300/2606 - loss 0.28787540 - time (sec): 31.08 - samples/sec: 5867.35 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:57:36,751 epoch 4 - iter 1560/2606 - loss 0.28010882 - time (sec): 37.37 - samples/sec: 5844.72 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:57:42,936 epoch 4 - iter 1820/2606 - loss 0.27843312 - time (sec): 43.55 - samples/sec: 5839.14 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:57:49,105 epoch 4 - iter 2080/2606 - loss 0.27479089 - time (sec): 49.72 - samples/sec: 5864.80 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:57:55,303 epoch 4 - iter 2340/2606 - loss 0.27732640 - time (sec): 55.92 - samples/sec: 5902.31 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:58:01,376 epoch 4 - iter 2600/2606 - loss 0.27630985 - time (sec): 61.99 - samples/sec: 5914.71 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:58:01,525 ----------------------------------------------------------------------------------------------------
2023-10-19 12:58:01,525 EPOCH 4 done: loss 0.2761 - lr: 0.000020
2023-10-19 12:58:06,639 DEV : loss 0.1368006467819214 - f1-score (micro avg)  0.2812
2023-10-19 12:58:06,663 saving best model
2023-10-19 12:58:06,696 ----------------------------------------------------------------------------------------------------
2023-10-19 12:58:12,745 epoch 5 - iter 260/2606 - loss 0.27544205 - time (sec): 6.05 - samples/sec: 5353.26 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:58:19,253 epoch 5 - iter 520/2606 - loss 0.27048322 - time (sec): 12.56 - samples/sec: 5822.74 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:58:25,185 epoch 5 - iter 780/2606 - loss 0.27499205 - time (sec): 18.49 - samples/sec: 5788.72 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:58:31,422 epoch 5 - iter 1040/2606 - loss 0.27262309 - time (sec): 24.73 - samples/sec: 5895.36 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:58:37,532 epoch 5 - iter 1300/2606 - loss 0.26767955 - time (sec): 30.84 - samples/sec: 5866.00 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:58:43,730 epoch 5 - iter 1560/2606 - loss 0.26432049 - time (sec): 37.03 - samples/sec: 5903.47 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:58:49,766 epoch 5 - iter 1820/2606 - loss 0.26184659 - time (sec): 43.07 - samples/sec: 5923.72 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:58:55,919 epoch 5 - iter 2080/2606 - loss 0.25841654 - time (sec): 49.22 - samples/sec: 5905.14 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:59:02,104 epoch 5 - iter 2340/2606 - loss 0.25700355 - time (sec): 55.41 - samples/sec: 5943.10 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:59:08,225 epoch 5 - iter 2600/2606 - loss 0.25425119 - time (sec): 61.53 - samples/sec: 5946.03 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:59:08,378 ----------------------------------------------------------------------------------------------------
2023-10-19 12:59:08,378 EPOCH 5 done: loss 0.2537 - lr: 0.000017
2023-10-19 12:59:13,542 DEV : loss 0.1384463608264923 - f1-score (micro avg)  0.258
2023-10-19 12:59:13,565 ----------------------------------------------------------------------------------------------------
2023-10-19 12:59:19,608 epoch 6 - iter 260/2606 - loss 0.24410473 - time (sec): 6.04 - samples/sec: 5843.56 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:59:25,809 epoch 6 - iter 520/2606 - loss 0.24055516 - time (sec): 12.24 - samples/sec: 5858.59 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:59:32,033 epoch 6 - iter 780/2606 - loss 0.24493204 - time (sec): 18.47 - samples/sec: 5812.66 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:59:38,165 epoch 6 - iter 1040/2606 - loss 0.24284525 - time (sec): 24.60 - samples/sec: 5874.06 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:59:44,277 epoch 6 - iter 1300/2606 - loss 0.24414411 - time (sec): 30.71 - samples/sec: 5884.75 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:59:50,455 epoch 6 - iter 1560/2606 - loss 0.24060126 - time (sec): 36.89 - samples/sec: 5951.34 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:59:56,650 epoch 6 - iter 1820/2606 - loss 0.24163513 - time (sec): 43.08 - samples/sec: 5961.14 - lr: 0.000014 - momentum: 0.000000
2023-10-19 13:00:02,781 epoch 6 - iter 2080/2606 - loss 0.23964372 - time (sec): 49.22 - samples/sec: 5933.99 - lr: 0.000014 - momentum: 0.000000
2023-10-19 13:00:09,004 epoch 6 - iter 2340/2606 - loss 0.24192637 - time (sec): 55.44 - samples/sec: 5923.76 - lr: 0.000014 - momentum: 0.000000
2023-10-19 13:00:15,418 epoch 6 - iter 2600/2606 - loss 0.23993975 - time (sec): 61.85 - samples/sec: 5929.96 - lr: 0.000013 - momentum: 0.000000
2023-10-19 13:00:15,546 ----------------------------------------------------------------------------------------------------
2023-10-19 13:00:15,546 EPOCH 6 done: loss 0.2400 - lr: 0.000013
2023-10-19 13:00:20,062 DEV : loss 0.14321103692054749 - f1-score (micro avg)  0.2734
2023-10-19 13:00:20,085 ----------------------------------------------------------------------------------------------------
2023-10-19 13:00:27,100 epoch 7 - iter 260/2606 - loss 0.24164308 - time (sec): 7.01 - samples/sec: 5259.63 - lr: 0.000013 - momentum: 0.000000
2023-10-19 13:00:33,329 epoch 7 - iter 520/2606 - loss 0.24625884 - time (sec): 13.24 - samples/sec: 5408.53 - lr: 0.000013 - momentum: 0.000000
2023-10-19 13:00:39,420 epoch 7 - iter 780/2606 - loss 0.24430191 - time (sec): 19.33 - samples/sec: 5562.82 - lr: 0.000012 - momentum: 0.000000
2023-10-19 13:00:45,575 epoch 7 - iter 1040/2606 - loss 0.23778745 - time (sec): 25.49 - samples/sec: 5687.54 - lr: 0.000012 - momentum: 0.000000
2023-10-19 13:00:51,698 epoch 7 - iter 1300/2606 - loss 0.23610972 - time (sec): 31.61 - samples/sec: 5791.14 - lr: 0.000012 - momentum: 0.000000
2023-10-19 13:00:57,716 epoch 7 - iter 1560/2606 - loss 0.23590207 - time (sec): 37.63 - samples/sec: 5821.94 - lr: 0.000011 - momentum: 0.000000
2023-10-19 13:01:03,875 epoch 7 - iter 1820/2606 - loss 0.23841444 - time (sec): 43.79 - samples/sec: 5901.75 - lr: 0.000011 - momentum: 0.000000
2023-10-19 13:01:09,484 epoch 7 - iter 2080/2606 - loss 0.23395377 - time (sec): 49.40 - samples/sec: 5942.72 - lr: 0.000011 - momentum: 0.000000
2023-10-19 13:01:15,618 epoch 7 - iter 2340/2606 - loss 0.23243325 - time (sec): 55.53 - samples/sec: 5942.11 - lr: 0.000010 - momentum: 0.000000
2023-10-19 13:01:21,873 epoch 7 - iter 2600/2606 - loss 0.23110482 - time (sec): 61.79 - samples/sec: 5927.51 - lr: 0.000010 - momentum: 0.000000
2023-10-19 13:01:22,031 ----------------------------------------------------------------------------------------------------
2023-10-19 13:01:22,031 EPOCH 7 done: loss 0.2309 - lr: 0.000010
2023-10-19 13:01:26,552 DEV : loss 0.1498628556728363 - f1-score (micro avg)  0.2672
2023-10-19 13:01:26,576 ----------------------------------------------------------------------------------------------------
2023-10-19 13:01:32,779 epoch 8 - iter 260/2606 - loss 0.20737514 - time (sec): 6.20 - samples/sec: 5909.35 - lr: 0.000010 - momentum: 0.000000
2023-10-19 13:01:39,130 epoch 8 - iter 520/2606 - loss 0.22682031 - time (sec): 12.55 - samples/sec: 6012.41 - lr: 0.000009 - momentum: 0.000000
2023-10-19 13:01:45,338 epoch 8 - iter 780/2606 - loss 0.23623211 - time (sec): 18.76 - samples/sec: 5965.33 - lr: 0.000009 - momentum: 0.000000
2023-10-19 13:01:51,502 epoch 8 - iter 1040/2606 - loss 0.22990084 - time (sec): 24.93 - samples/sec: 5955.94 - lr: 0.000009 - momentum: 0.000000
2023-10-19 13:01:58,319 epoch 8 - iter 1300/2606 - loss 0.22840380 - time (sec): 31.74 - samples/sec: 5776.62 - lr: 0.000008 - momentum: 0.000000
2023-10-19 13:02:04,394 epoch 8 - iter 1560/2606 - loss 0.22845898 - time (sec): 37.82 - samples/sec: 5805.05 - lr: 0.000008 - momentum: 0.000000
2023-10-19 13:02:10,485 epoch 8 - iter 1820/2606 - loss 0.22744141 - time (sec): 43.91 - samples/sec: 5850.60 - lr: 0.000008 - momentum: 0.000000
2023-10-19 13:02:16,447 epoch 8 - iter 2080/2606 - loss 0.22448400 - time (sec): 49.87 - samples/sec: 5849.57 - lr: 0.000007 - momentum: 0.000000
2023-10-19 13:02:22,494 epoch 8 - iter 2340/2606 - loss 0.22103987 - time (sec): 55.92 - samples/sec: 5876.35 - lr: 0.000007 - momentum: 0.000000
2023-10-19 13:02:28,591 epoch 8 - iter 2600/2606 - loss 0.22138364 - time (sec): 62.01 - samples/sec: 5907.34 - lr: 0.000007 - momentum: 0.000000
2023-10-19 13:02:28,729 ----------------------------------------------------------------------------------------------------
2023-10-19 13:02:28,729 EPOCH 8 done: loss 0.2214 - lr: 0.000007
2023-10-19 13:02:33,290 DEV : loss 0.15868689119815826 - f1-score (micro avg)  0.2516
2023-10-19 13:02:33,315 ----------------------------------------------------------------------------------------------------
2023-10-19 13:02:39,522 epoch 9 - iter 260/2606 - loss 0.23519503 - time (sec): 6.21 - samples/sec: 5990.34 - lr: 0.000006 - momentum: 0.000000
2023-10-19 13:02:45,723 epoch 9 - iter 520/2606 - loss 0.23340713 - time (sec): 12.41 - samples/sec: 5822.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 13:02:51,963 epoch 9 - iter 780/2606 - loss 0.22121362 - time (sec): 18.65 - samples/sec: 5914.69 - lr: 0.000006 - momentum: 0.000000
2023-10-19 13:02:57,756 epoch 9 - iter 1040/2606 - loss 0.22277727 - time (sec): 24.44 - samples/sec: 6011.45 - lr: 0.000005 - momentum: 0.000000
2023-10-19 13:03:03,526 epoch 9 - iter 1300/2606 - loss 0.21739257 - time (sec): 30.21 - samples/sec: 6042.24 - lr: 0.000005 - momentum: 0.000000
2023-10-19 13:03:09,980 epoch 9 - iter 1560/2606 - loss 0.21589958 - time (sec): 36.66 - samples/sec: 5941.02 - lr: 0.000005 - momentum: 0.000000
2023-10-19 13:03:16,633 epoch 9 - iter 1820/2606 - loss 0.21700838 - time (sec): 43.32 - samples/sec: 5906.06 - lr: 0.000004 - momentum: 0.000000
2023-10-19 13:03:22,806 epoch 9 - iter 2080/2606 - loss 0.21776357 - time (sec): 49.49 - samples/sec: 5885.89 - lr: 0.000004 - momentum: 0.000000
2023-10-19 13:03:28,910 epoch 9 - iter 2340/2606 - loss 0.21505972 - time (sec): 55.59 - samples/sec: 5894.57 - lr: 0.000004 - momentum: 0.000000
2023-10-19 13:03:35,753 epoch 9 - iter 2600/2606 - loss 0.21612136 - time (sec): 62.44 - samples/sec: 5872.69 - lr: 0.000003 - momentum: 0.000000
2023-10-19 13:03:35,906 ----------------------------------------------------------------------------------------------------
2023-10-19 13:03:35,906 EPOCH 9 done: loss 0.2160 - lr: 0.000003
2023-10-19 13:03:40,430 DEV : loss 0.1565387099981308 - f1-score (micro avg)  0.2539
2023-10-19 13:03:40,454 ----------------------------------------------------------------------------------------------------
2023-10-19 13:03:46,726 epoch 10 - iter 260/2606 - loss 0.17928057 - time (sec): 6.27 - samples/sec: 6050.14 - lr: 0.000003 - momentum: 0.000000
2023-10-19 13:03:52,865 epoch 10 - iter 520/2606 - loss 0.20216081 - time (sec): 12.41 - samples/sec: 5920.28 - lr: 0.000003 - momentum: 0.000000
2023-10-19 13:03:59,023 epoch 10 - iter 780/2606 - loss 0.21189348 - time (sec): 18.57 - samples/sec: 5847.83 - lr: 0.000002 - momentum: 0.000000
2023-10-19 13:04:05,082 epoch 10 - iter 1040/2606 - loss 0.21520943 - time (sec): 24.63 - samples/sec: 5915.94 - lr: 0.000002 - momentum: 0.000000
2023-10-19 13:04:11,292 epoch 10 - iter 1300/2606 - loss 0.21578914 - time (sec): 30.84 - samples/sec: 5992.28 - lr: 0.000002 - momentum: 0.000000
2023-10-19 13:04:17,267 epoch 10 - iter 1560/2606 - loss 0.21439979 - time (sec): 36.81 - samples/sec: 5916.03 - lr: 0.000001 - momentum: 0.000000
2023-10-19 13:04:23,393 epoch 10 - iter 1820/2606 - loss 0.21220142 - time (sec): 42.94 - samples/sec: 5978.77 - lr: 0.000001 - momentum: 0.000000
2023-10-19 13:04:29,500 epoch 10 - iter 2080/2606 - loss 0.21586521 - time (sec): 49.05 - samples/sec: 5928.12 - lr: 0.000001 - momentum: 0.000000
2023-10-19 13:04:35,766 epoch 10 - iter 2340/2606 - loss 0.21366698 - time (sec): 55.31 - samples/sec: 5944.29 - lr: 0.000000 - momentum: 0.000000
2023-10-19 13:04:42,097 epoch 10 - iter 2600/2606 - loss 0.21481396 - time (sec): 61.64 - samples/sec: 5949.21 - lr: 0.000000 - momentum: 0.000000
2023-10-19 13:04:42,236 ----------------------------------------------------------------------------------------------------
2023-10-19 13:04:42,236 EPOCH 10 done: loss 0.2146 - lr: 0.000000
2023-10-19 13:04:47,481 DEV : loss 0.1586076319217682 - f1-score (micro avg)  0.2592
2023-10-19 13:04:47,536 ----------------------------------------------------------------------------------------------------
2023-10-19 13:04:47,536 Loading model from best epoch ...
2023-10-19 13:04:47,624 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
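The 17-tag dictionary is the BIOES encoding of the corpus's four entity types plus the outside tag: each type contributes S(ingle)-, B(egin)-, E(nd)- and I(nside)- variants, so 4 × 4 + 1 = 17 (matching out_features=17 of the model's final linear layer). A minimal sketch of how such a tag set is enumerated, in the same order as the log line above:

```python
# Entity types from the NewsEye corpus; BIOES adds four positional variants each.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in "SBEI"]
print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```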
2023-10-19 13:04:54,023 
Results:
- F-score (micro) 0.2533
- F-score (macro) 0.1366
- Accuracy 0.1461

By class:
              precision    recall  f1-score   support

         LOC     0.4433    0.3542    0.3938      1214
         PER     0.1609    0.1027    0.1254       808
         ORG     0.0336    0.0227    0.0271       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3022    0.2180    0.2533      2390
   macro avg     0.1594    0.1199    0.1366      2390
weighted avg     0.2845    0.2180    0.2464      2390
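The three summary rows follow the standard averaging conventions: micro averages pool predictions across classes (here it is the harmonic mean of the pooled precision 0.3022 and recall 0.2180), macro is the unweighted mean over the four classes, and weighted weights each class by its support. A sketch reproducing the F1 summary values from the per-class numbers above:

```python
# Per-class scores copied from the report above: (precision, recall, f1, support)
by_class = {
    "LOC":       (0.4433, 0.3542, 0.3938, 1214),
    "PER":       (0.1609, 0.1027, 0.1254,  808),
    "ORG":       (0.0336, 0.0227, 0.0271,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support (number of gold entities).
total_support = sum(s for *_, s in by_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# Micro F1: harmonic mean of the pooled precision/recall reported above.
p, r = 0.3022, 0.2180
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

This recovers 0.1366, 0.2464 and 0.2533, i.e. the macro, weighted and micro F1 rows of the report.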
2023-10-19 13:04:54,023 ----------------------------------------------------------------------------------------------------