2023-10-20 00:14:53,713 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,713 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-20 00:14:53,713 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,713 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:14:53,713 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Train:  1085 sentences
2023-10-20 00:14:53,714         (train_with_dev=False, train_with_test=False)
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Training Params:
2023-10-20 00:14:53,714  - learning_rate: "3e-05"
2023-10-20 00:14:53,714  - mini_batch_size: "8"
2023-10-20 00:14:53,714  - max_epochs: "10"
2023-10-20 00:14:53,714  - shuffle: "True"
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Plugins:
2023-10-20 00:14:53,714  - TensorboardLogger
2023-10-20 00:14:53,714  - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:14:53,714  - metric: "('micro avg', 'f1-score')"
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Computation:
2023-10-20 00:14:53,714  - compute on device: cuda:0
2023-10-20 00:14:53,714  - embedding storage: none
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-20 00:14:54,037 epoch 1 - iter 13/136 - loss 2.72152588 - time (sec): 0.32 - samples/sec: 14670.99 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:14:54,392 epoch 1 - iter 26/136 - loss 2.78487699 - time (sec): 0.68 - samples/sec: 15197.47 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:14:54,714 epoch 1 - iter 39/136 - loss 2.77142009 - time (sec): 1.00 - samples/sec: 13959.57 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:14:55,063 epoch 1 - iter 52/136 - loss 2.70481600 - time (sec): 1.35 - samples/sec: 13540.38 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:14:55,437 epoch 1 - iter 65/136 - loss 2.61063044 - time (sec): 1.72 - samples/sec: 13659.15 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:14:55,789 epoch 1 - iter 78/136 - loss 2.51998300 - time (sec): 2.07 - samples/sec: 13812.47 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:14:56,150 epoch 1 - iter 91/136 - loss 2.40568426 - time (sec): 2.44 - samples/sec: 14092.89 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:14:56,497 epoch 1 - iter 104/136 - loss 2.29489111 - time (sec): 2.78 - samples/sec: 14204.10 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:14:56,871 epoch 1 - iter 117/136 - loss 2.14436655 - time (sec): 3.16 - samples/sec: 14496.12 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:14:57,217 epoch 1 - iter 130/136 - loss 2.03866895 - time (sec): 3.50 - samples/sec: 14314.43 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:14:57,370 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:57,370 EPOCH 1 done: loss 2.0027 - lr: 0.000028
2023-10-20 00:14:57,644 DEV : loss 0.5147674083709717 - f1-score (micro avg)  0.0
2023-10-20 00:14:57,647 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:57,998 epoch 2 - iter 13/136 - loss 0.76075803 - time (sec): 0.35 - samples/sec: 13093.46 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:14:58,358 epoch 2 - iter 26/136 - loss 0.68474176 - time (sec): 0.71 - samples/sec: 13609.77 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:14:58,720 epoch 2 - iter 39/136 - loss 0.67307661 - time (sec): 1.07 - samples/sec: 14304.35 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:14:59,064 epoch 2 - iter 52/136 - loss 0.67974143 - time (sec): 1.42 - samples/sec: 14231.98 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:14:59,438 epoch 2 - iter 65/136 - loss 0.66999883 - time (sec): 1.79 - samples/sec: 14118.85 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:14:59,787 epoch 2 - iter 78/136 - loss 0.64717688 - time (sec): 2.14 - samples/sec: 14069.74 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:15:00,127 epoch 2 - iter 91/136 - loss 0.64394003 - time (sec): 2.48 - samples/sec: 14098.17 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:15:00,476 epoch 2 - iter 104/136 - loss 0.64957491 - time (sec): 2.83 - samples/sec: 14269.41 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:15:00,819 epoch 2 - iter 117/136 - loss 0.65572779 - time (sec): 3.17 - samples/sec: 14018.30 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:15:01,171 epoch 2 - iter 130/136 - loss 0.64649843 - time (sec): 3.52 - samples/sec: 14109.96 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:15:01,334 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:01,334 EPOCH 2 done: loss 0.6472 - lr: 0.000027
2023-10-20 00:15:02,256 DEV : loss 0.4484618604183197 - f1-score (micro avg)  0.0
2023-10-20 00:15:02,261 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:02,614 epoch 3 - iter 13/136 - loss 0.52962342 - time (sec): 0.35 - samples/sec: 13201.03 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:15:02,957 epoch 3 - iter 26/136 - loss 0.57354178 - time (sec): 0.70 - samples/sec: 12403.69 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:15:03,296 epoch 3 - iter 39/136 - loss 0.59345434 - time (sec): 1.03 - samples/sec: 12821.34 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:15:03,639 epoch 3 - iter 52/136 - loss 0.56801542 - time (sec): 1.38 - samples/sec: 13196.90 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:15:03,996 epoch 3 - iter 65/136 - loss 0.56542056 - time (sec): 1.73 - samples/sec: 13651.98 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:15:04,353 epoch 3 - iter 78/136 - loss 0.57839887 - time (sec): 2.09 - samples/sec: 13666.00 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:15:04,710 epoch 3 - iter 91/136 - loss 0.57167273 - time (sec): 2.45 - samples/sec: 13820.74 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,077 epoch 3 - iter 104/136 - loss 0.56833936 - time (sec): 2.82 - samples/sec: 14255.05 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,443 epoch 3 - iter 117/136 - loss 0.55811131 - time (sec): 3.18 - samples/sec: 14123.93 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,795 epoch 3 - iter 130/136 - loss 0.55001629 - time (sec): 3.53 - samples/sec: 14209.42 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:05,939 EPOCH 3 done: loss 0.5457 - lr: 0.000024
2023-10-20 00:15:06,696 DEV : loss 0.38889962434768677 - f1-score (micro avg)  0.0
2023-10-20 00:15:06,700 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:07,059 epoch 4 - iter 13/136 - loss 0.50154065 - time (sec): 0.36 - samples/sec: 14223.36 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:15:07,433 epoch 4 - iter 26/136 - loss 0.50606387 - time (sec): 0.73 - samples/sec: 14296.91 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:15:07,798 epoch 4 - iter 39/136 - loss 0.49365564 - time (sec): 1.10 - samples/sec: 14267.18 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:15:08,136 epoch 4 - iter 52/136 - loss 0.49818870 - time (sec): 1.44 - samples/sec: 13973.64 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:15:08,477 epoch 4 - iter 65/136 - loss 0.50072732 - time (sec): 1.78 - samples/sec: 13848.54 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:15:08,829 epoch 4 - iter 78/136 - loss 0.50225318 - time (sec): 2.13 - samples/sec: 13777.37 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:09,189 epoch 4 - iter 91/136 - loss 0.49964287 - time (sec): 2.49 - samples/sec: 13999.68 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:09,563 epoch 4 - iter 104/136 - loss 0.48388305 - time (sec): 2.86 - samples/sec: 14128.12 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:09,899 epoch 4 - iter 117/136 - loss 0.48686068 - time (sec): 3.20 - samples/sec: 14056.48 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:10,249 epoch 4 - iter 130/136 - loss 0.49113062 - time (sec): 3.55 - samples/sec: 14245.09 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:15:10,390 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:10,390 EPOCH 4 done: loss 0.4923 - lr: 0.000020
2023-10-20 00:15:11,161 DEV : loss 0.36093470454216003 - f1-score (micro avg)  0.0138
2023-10-20 00:15:11,165 saving best model
2023-10-20 00:15:11,191 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:11,535 epoch 5 - iter 13/136 - loss 0.48685833 - time (sec): 0.34 - samples/sec: 11961.22 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:15:11,874 epoch 5 - iter 26/136 - loss 0.48360129 - time (sec): 0.68 - samples/sec: 13230.05 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:15:12,217 epoch 5 - iter 39/136 - loss 0.46610858 - time (sec): 1.03 - samples/sec: 13717.31 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:15:12,525 epoch 5 - iter 52/136 - loss 0.45470638 - time (sec): 1.33 - samples/sec: 14260.50 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:15:12,824 epoch 5 - iter 65/136 - loss 0.46582992 - time (sec): 1.63 - samples/sec: 15190.07 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:13,106 epoch 5 - iter 78/136 - loss 0.47319200 - time (sec): 1.91 - samples/sec: 15193.34 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:13,405 epoch 5 - iter 91/136 - loss 0.47193998 - time (sec): 2.21 - samples/sec: 15268.23 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:13,875 epoch 5 - iter 104/136 - loss 0.47051975 - time (sec): 2.68 - samples/sec: 14771.05 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:14,181 epoch 5 - iter 117/136 - loss 0.46638781 - time (sec): 2.99 - samples/sec: 14963.65 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:15:14,471 epoch 5 - iter 130/136 - loss 0.46766506 - time (sec): 3.28 - samples/sec: 15010.97 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:15:14,621 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:14,621 EPOCH 5 done: loss 0.4669 - lr: 0.000017
2023-10-20 00:15:15,381 DEV : loss 0.33118170499801636 - f1-score (micro avg)  0.0583
2023-10-20 00:15:15,385 saving best model
2023-10-20 00:15:15,415 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:15,713 epoch 6 - iter 13/136 - loss 0.42746417 - time (sec): 0.30 - samples/sec: 17301.41 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:15:16,005 epoch 6 - iter 26/136 - loss 0.40067902 - time (sec): 0.59 - samples/sec: 17521.70 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:15:16,294 epoch 6 - iter 39/136 - loss 0.39703205 - time (sec): 0.88 - samples/sec: 17066.06 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:15:16,598 epoch 6 - iter 52/136 - loss 0.41625900 - time (sec): 1.18 - samples/sec: 16871.78 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:16,918 epoch 6 - iter 65/136 - loss 0.43477326 - time (sec): 1.50 - samples/sec: 16945.81 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:17,231 epoch 6 - iter 78/136 - loss 0.44587471 - time (sec): 1.82 - samples/sec: 16725.04 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:17,526 epoch 6 - iter 91/136 - loss 0.44234516 - time (sec): 2.11 - samples/sec: 16608.76 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:17,820 epoch 6 - iter 104/136 - loss 0.43939936 - time (sec): 2.40 - samples/sec: 16605.70 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:18,115 epoch 6 - iter 117/136 - loss 0.43863542 - time (sec): 2.70 - samples/sec: 16716.66 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:18,398 epoch 6 - iter 130/136 - loss 0.43744152 - time (sec): 2.98 - samples/sec: 16595.78 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:18,540 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:18,540 EPOCH 6 done: loss 0.4393 - lr: 0.000014
2023-10-20 00:15:19,304 DEV : loss 0.3201183080673218 - f1-score (micro avg)  0.0688
2023-10-20 00:15:19,308 saving best model
2023-10-20 00:15:19,338 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:19,640 epoch 7 - iter 13/136 - loss 0.47262970 - time (sec): 0.30 - samples/sec: 16933.74 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:15:19,941 epoch 7 - iter 26/136 - loss 0.46150833 - time (sec): 0.60 - samples/sec: 17857.65 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:15:20,217 epoch 7 - iter 39/136 - loss 0.45981501 - time (sec): 0.88 - samples/sec: 16783.72 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:20,507 epoch 7 - iter 52/136 - loss 0.45397364 - time (sec): 1.17 - samples/sec: 16965.94 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:20,814 epoch 7 - iter 65/136 - loss 0.43705463 - time (sec): 1.47 - samples/sec: 17092.66 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:21,105 epoch 7 - iter 78/136 - loss 0.42748089 - time (sec): 1.77 - samples/sec: 16888.56 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:21,421 epoch 7 - iter 91/136 - loss 0.42287302 - time (sec): 2.08 - samples/sec: 16908.57 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:15:21,729 epoch 7 - iter 104/136 - loss 0.42334313 - time (sec): 2.39 - samples/sec: 16828.07 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:15:22,020 epoch 7 - iter 117/136 - loss 0.42227118 - time (sec): 2.68 - samples/sec: 16714.39 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:15:22,309 epoch 7 - iter 130/136 - loss 0.42414636 - time (sec): 2.97 - samples/sec: 16698.50 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:15:22,453 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:22,453 EPOCH 7 done: loss 0.4222 - lr: 0.000010
2023-10-20 00:15:23,228 DEV : loss 0.3113357424736023 - f1-score (micro avg)  0.1068
2023-10-20 00:15:23,232 saving best model
2023-10-20 00:15:23,264 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:23,593 epoch 8 - iter 13/136 - loss 0.34479809 - time (sec): 0.33 - samples/sec: 17802.61 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:15:23,908 epoch 8 - iter 26/136 - loss 0.39009149 - time (sec): 0.64 - samples/sec: 16766.92 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:24,387 epoch 8 - iter 39/136 - loss 0.40721604 - time (sec): 1.12 - samples/sec: 13488.55 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:24,735 epoch 8 - iter 52/136 - loss 0.38471006 - time (sec): 1.47 - samples/sec: 14020.41 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:25,083 epoch 8 - iter 65/136 - loss 0.39657642 - time (sec): 1.82 - samples/sec: 14084.04 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:25,441 epoch 8 - iter 78/136 - loss 0.39140758 - time (sec): 2.18 - samples/sec: 13946.80 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:15:25,803 epoch 8 - iter 91/136 - loss 0.39569377 - time (sec): 2.54 - samples/sec: 13747.17 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:15:26,169 epoch 8 - iter 104/136 - loss 0.39722167 - time (sec): 2.90 - samples/sec: 13769.71 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:15:26,507 epoch 8 - iter 117/136 - loss 0.40225322 - time (sec): 3.24 - samples/sec: 13804.68 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:15:26,850 epoch 8 - iter 130/136 - loss 0.40087826 - time (sec): 3.59 - samples/sec: 13697.91 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:15:27,013 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:27,014 EPOCH 8 done: loss 0.4033 - lr: 0.000007
2023-10-20 00:15:27,801 DEV : loss 0.306252121925354 - f1-score (micro avg)  0.1214
2023-10-20 00:15:27,806 saving best model
2023-10-20 00:15:27,837 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:28,208 epoch 9 - iter 13/136 - loss 0.37030615 - time (sec): 0.37 - samples/sec: 13553.37 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:28,588 epoch 9 - iter 26/136 - loss 0.41854927 - time (sec): 0.75 - samples/sec: 13488.88 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:28,952 epoch 9 - iter 39/136 - loss 0.43664378 - time (sec): 1.11 - samples/sec: 13185.13 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:29,344 epoch 9 - iter 52/136 - loss 0.41430370 - time (sec): 1.51 - samples/sec: 13793.08 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:29,711 epoch 9 - iter 65/136 - loss 0.40126303 - time (sec): 1.87 - samples/sec: 13950.75 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:15:30,073 epoch 9 - iter 78/136 - loss 0.40287344 - time (sec): 2.24 - samples/sec: 13858.82 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:15:30,423 epoch 9 - iter 91/136 - loss 0.40298450 - time (sec): 2.59 - samples/sec: 13699.40 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:15:30,761 epoch 9 - iter 104/136 - loss 0.40534819 - time (sec): 2.92 - samples/sec: 13763.92 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:31,103 epoch 9 - iter 117/136 - loss 0.40660709 - time (sec): 3.27 - samples/sec: 13741.73 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:31,430 epoch 9 - iter 130/136 - loss 0.40302480 - time (sec): 3.59 - samples/sec: 13749.44 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:31,600 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:31,600 EPOCH 9 done: loss 0.4046 - lr: 0.000004
2023-10-20 00:15:32,365 DEV : loss 0.3038671314716339 - f1-score (micro avg)  0.142
2023-10-20 00:15:32,369 saving best model
2023-10-20 00:15:32,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:32,754 epoch 10 - iter 13/136 - loss 0.46654899 - time (sec): 0.35 - samples/sec: 13147.01 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:15:33,126 epoch 10 - iter 26/136 - loss 0.39348560 - time (sec): 0.72 - samples/sec: 13843.17 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:15:33,494 epoch 10 - iter 39/136 - loss 0.37965230 - time (sec): 1.09 - samples/sec: 13304.68 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:15:33,857 epoch 10 - iter 52/136 - loss 0.40319403 - time (sec): 1.45 - samples/sec: 13658.42 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:15:34,187 epoch 10 - iter 65/136 - loss 0.38310015 - time (sec): 1.78 - samples/sec: 13803.09 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:15:34,536 epoch 10 - iter 78/136 - loss 0.39158606 - time (sec): 2.13 - samples/sec: 13825.33 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:15:34,902 epoch 10 - iter 91/136 - loss 0.39702863 - time (sec): 2.50 - samples/sec: 13852.96 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:15:35,254 epoch 10 - iter 104/136 - loss 0.40694497 - time (sec): 2.85 - samples/sec: 13624.47 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:15:35,628 epoch 10 - iter 117/136 - loss 0.39072910 - time (sec): 3.23 - samples/sec: 14022.56 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:15:36,110 epoch 10 - iter 130/136 - loss 0.39313748 - time (sec): 3.71 - samples/sec: 13420.73 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:15:36,265 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:36,265 EPOCH 10 done: loss 0.3934 - lr: 0.000000
2023-10-20 00:15:37,034 DEV : loss 0.3026784360408783 - f1-score (micro avg)  0.1521
2023-10-20 00:15:37,038 saving best model
2023-10-20 00:15:37,094 ----------------------------------------------------------------------------------------------------
|
2023-10-20 00:15:37,094 Loading model from best epoch ...
2023-10-20 00:15:37,164 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-20 00:15:37,989
Results:
- F-score (micro) 0.1433
- F-score (macro) 0.0781
- Accuracy 0.0802

By class:
              precision    recall  f1-score   support

         PER     0.1779    0.1779    0.1779       208
         LOC     0.3514    0.0833    0.1347       312
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.2234    0.1055    0.1433       597
   macro avg     0.1323    0.0653    0.0781       597
weighted avg     0.2456    0.1055    0.1324       597

2023-10-20 00:15:37,990 ----------------------------------------------------------------------------------------------------