2023-10-20 00:26:59,401 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,401 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 00:26:59,401 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Train: 1085 sentences
2023-10-20 00:26:59,402 (train_with_dev=False, train_with_test=False)
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Training Params:
2023-10-20 00:26:59,402  - learning_rate: "5e-05"
2023-10-20 00:26:59,402  - mini_batch_size: "8"
2023-10-20 00:26:59,402  - max_epochs: "10"
2023-10-20 00:26:59,402  - shuffle: "True"
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Plugins:
2023-10-20 00:26:59,402  - TensorboardLogger
2023-10-20 00:26:59,402  - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
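The LinearScheduler plugin with warmup_fraction '0.1' explains the lr column in the iteration lines below: the rate climbs linearly toward the peak of 5e-05 over the first 10% of all steps (roughly the first epoch, since training runs 10 epochs of 136 iterations), then decays linearly to zero by the final iteration. A minimal sketch of that schedule in plain Python (not Flair's actual implementation; step counts taken from this log):

```python
def linear_schedule_with_warmup(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # decay phase: linearly from peak_lr at end of warmup down to 0 at total_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 10 epochs x 136 iterations = 1360 total steps, peak lr 5e-05 (as in this run)
total, peak = 1360, 5e-05
print(linear_schedule_with_warmup(136, total, peak))   # end of warmup: peak lr
print(linear_schedule_with_warmup(1360, total, peak))  # final step: 0.0
```

With these numbers the warmup ends after 136 steps, which matches the lr peaking around the start of epoch 2 and reaching ~0.000000 at epoch 10, iter 130/136.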
2023-10-20 00:26:59,402 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:26:59,402  - metric: "('micro avg', 'f1-score')"
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Computation:
2023-10-20 00:26:59,402  - compute on device: cuda:0
2023-10-20 00:26:59,402  - embedding storage: none
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:26:59,746 epoch 1 - iter 13/136 - loss 2.99174193 - time (sec): 0.34 - samples/sec: 15290.18 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:27:00,082 epoch 1 - iter 26/136 - loss 2.96413466 - time (sec): 0.68 - samples/sec: 14525.00 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:27:00,448 epoch 1 - iter 39/136 - loss 2.84053462 - time (sec): 1.05 - samples/sec: 13972.13 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:27:00,831 epoch 1 - iter 52/136 - loss 2.76852975 - time (sec): 1.43 - samples/sec: 13566.94 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:27:01,184 epoch 1 - iter 65/136 - loss 2.62124449 - time (sec): 1.78 - samples/sec: 13686.57 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:27:01,539 epoch 1 - iter 78/136 - loss 2.45694965 - time (sec): 2.14 - samples/sec: 13863.28 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:27:01,909 epoch 1 - iter 91/136 - loss 2.29121118 - time (sec): 2.51 - samples/sec: 14173.74 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:27:02,238 epoch 1 - iter 104/136 - loss 2.18437061 - time (sec): 2.84 - samples/sec: 13903.44 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:27:02,598 epoch 1 - iter 117/136 - loss 2.04645061 - time (sec): 3.20 - samples/sec: 13827.93 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:27:02,955 epoch 1 - iter 130/136 - loss 1.89109901 - time (sec): 3.55 - samples/sec: 13960.34 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:27:03,099 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:03,100 EPOCH 1 done: loss 1.8348 - lr: 0.000047
2023-10-20 00:27:03,368 DEV : loss 0.5015563368797302 - f1-score (micro avg) 0.0
2023-10-20 00:27:03,372 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:03,729 epoch 2 - iter 13/136 - loss 0.60857951 - time (sec): 0.36 - samples/sec: 16459.68 - lr: 0.000050 - momentum: 0.000000
2023-10-20 00:27:04,077 epoch 2 - iter 26/136 - loss 0.63793114 - time (sec): 0.70 - samples/sec: 14440.02 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:27:04,439 epoch 2 - iter 39/136 - loss 0.68791160 - time (sec): 1.07 - samples/sec: 13666.75 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:27:04,793 epoch 2 - iter 52/136 - loss 0.63859874 - time (sec): 1.42 - samples/sec: 13906.09 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:27:05,149 epoch 2 - iter 65/136 - loss 0.61668645 - time (sec): 1.78 - samples/sec: 14049.74 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:27:05,498 epoch 2 - iter 78/136 - loss 0.60331772 - time (sec): 2.13 - samples/sec: 13974.49 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:27:05,859 epoch 2 - iter 91/136 - loss 0.61615866 - time (sec): 2.49 - samples/sec: 14136.37 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:27:06,221 epoch 2 - iter 104/136 - loss 0.60741981 - time (sec): 2.85 - samples/sec: 13954.59 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:27:06,586 epoch 2 - iter 117/136 - loss 0.60342841 - time (sec): 3.21 - samples/sec: 14014.59 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:27:06,936 epoch 2 - iter 130/136 - loss 0.59755673 - time (sec): 3.56 - samples/sec: 13976.85 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:27:07,089 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:07,089 EPOCH 2 done: loss 0.5906 - lr: 0.000045
2023-10-20 00:27:08,025 DEV : loss 0.39024820923805237 - f1-score (micro avg) 0.0071
2023-10-20 00:27:08,029 saving best model
2023-10-20 00:27:08,054 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:08,423 epoch 3 - iter 13/136 - loss 0.44911705 - time (sec): 0.37 - samples/sec: 14818.97 - lr: 0.000044 - momentum: 0.000000
2023-10-20 00:27:08,766 epoch 3 - iter 26/136 - loss 0.50074753 - time (sec): 0.71 - samples/sec: 14126.48 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:27:09,115 epoch 3 - iter 39/136 - loss 0.50283672 - time (sec): 1.06 - samples/sec: 13887.55 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:27:09,457 epoch 3 - iter 52/136 - loss 0.47622121 - time (sec): 1.40 - samples/sec: 13509.51 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:27:09,803 epoch 3 - iter 65/136 - loss 0.48752604 - time (sec): 1.75 - samples/sec: 13728.02 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:27:10,172 epoch 3 - iter 78/136 - loss 0.48621133 - time (sec): 2.12 - samples/sec: 14277.44 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:27:10,541 epoch 3 - iter 91/136 - loss 0.46968554 - time (sec): 2.49 - samples/sec: 14755.84 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:27:10,883 epoch 3 - iter 104/136 - loss 0.46674456 - time (sec): 2.83 - samples/sec: 14647.74 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:27:11,219 epoch 3 - iter 117/136 - loss 0.46645110 - time (sec): 3.16 - samples/sec: 14437.25 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:27:11,564 epoch 3 - iter 130/136 - loss 0.46824379 - time (sec): 3.51 - samples/sec: 14300.80 - lr: 0.000039 - momentum: 0.000000
2023-10-20 00:27:11,715 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:11,715 EPOCH 3 done: loss 0.4653 - lr: 0.000039
2023-10-20 00:27:12,469 DEV : loss 0.3429010510444641 - f1-score (micro avg) 0.0399
2023-10-20 00:27:12,473 saving best model
2023-10-20 00:27:12,505 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:12,846 epoch 4 - iter 13/136 - loss 0.45667035 - time (sec): 0.34 - samples/sec: 12986.45 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:27:13,200 epoch 4 - iter 26/136 - loss 0.40057927 - time (sec): 0.69 - samples/sec: 13731.49 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:27:13,555 epoch 4 - iter 39/136 - loss 0.40562505 - time (sec): 1.05 - samples/sec: 13990.65 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:27:13,940 epoch 4 - iter 52/136 - loss 0.40273949 - time (sec): 1.43 - samples/sec: 14383.51 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:27:14,280 epoch 4 - iter 65/136 - loss 0.40903860 - time (sec): 1.77 - samples/sec: 14456.07 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:27:14,636 epoch 4 - iter 78/136 - loss 0.40452734 - time (sec): 2.13 - samples/sec: 14290.94 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:27:14,984 epoch 4 - iter 91/136 - loss 0.41457293 - time (sec): 2.48 - samples/sec: 14381.24 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:27:15,333 epoch 4 - iter 104/136 - loss 0.41519681 - time (sec): 2.83 - samples/sec: 14305.51 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:27:15,674 epoch 4 - iter 117/136 - loss 0.42095346 - time (sec): 3.17 - samples/sec: 14035.12 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:27:16,039 epoch 4 - iter 130/136 - loss 0.41867387 - time (sec): 3.53 - samples/sec: 14113.12 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:27:16,203 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:16,203 EPOCH 4 done: loss 0.4188 - lr: 0.000034
2023-10-20 00:27:16,953 DEV : loss 0.32259702682495117 - f1-score (micro avg) 0.0985
2023-10-20 00:27:16,957 saving best model
2023-10-20 00:27:16,987 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:17,345 epoch 5 - iter 13/136 - loss 0.40845983 - time (sec): 0.36 - samples/sec: 14213.54 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:27:17,703 epoch 5 - iter 26/136 - loss 0.37364817 - time (sec): 0.72 - samples/sec: 13529.45 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:27:18,046 epoch 5 - iter 39/136 - loss 0.39076166 - time (sec): 1.06 - samples/sec: 13658.06 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:27:18,401 epoch 5 - iter 52/136 - loss 0.39259624 - time (sec): 1.41 - samples/sec: 13777.80 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:27:18,757 epoch 5 - iter 65/136 - loss 0.39848236 - time (sec): 1.77 - samples/sec: 14109.05 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:27:19,107 epoch 5 - iter 78/136 - loss 0.39252052 - time (sec): 2.12 - samples/sec: 13939.66 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:27:19,458 epoch 5 - iter 91/136 - loss 0.38452188 - time (sec): 2.47 - samples/sec: 14039.84 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:27:19,812 epoch 5 - iter 104/136 - loss 0.37804562 - time (sec): 2.82 - samples/sec: 14043.23 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:27:20,320 epoch 5 - iter 117/136 - loss 0.38959493 - time (sec): 3.33 - samples/sec: 13289.05 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:27:20,686 epoch 5 - iter 130/136 - loss 0.38953709 - time (sec): 3.70 - samples/sec: 13406.52 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:27:20,849 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:20,850 EPOCH 5 done: loss 0.3876 - lr: 0.000028
2023-10-20 00:27:21,634 DEV : loss 0.28990423679351807 - f1-score (micro avg) 0.2169
2023-10-20 00:27:21,638 saving best model
2023-10-20 00:27:21,673 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:22,002 epoch 6 - iter 13/136 - loss 0.43011177 - time (sec): 0.33 - samples/sec: 13614.39 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:27:22,366 epoch 6 - iter 26/136 - loss 0.42935702 - time (sec): 0.69 - samples/sec: 13490.60 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:27:22,730 epoch 6 - iter 39/136 - loss 0.38279064 - time (sec): 1.06 - samples/sec: 14158.37 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:27:23,084 epoch 6 - iter 52/136 - loss 0.38870660 - time (sec): 1.41 - samples/sec: 14264.05 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:27:23,445 epoch 6 - iter 65/136 - loss 0.38485490 - time (sec): 1.77 - samples/sec: 14103.90 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:27:23,786 epoch 6 - iter 78/136 - loss 0.37129942 - time (sec): 2.11 - samples/sec: 14357.55 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:27:24,144 epoch 6 - iter 91/136 - loss 0.36016383 - time (sec): 2.47 - samples/sec: 14405.05 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:27:24,497 epoch 6 - iter 104/136 - loss 0.36838511 - time (sec): 2.82 - samples/sec: 14182.08 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:27:24,832 epoch 6 - iter 117/136 - loss 0.36617739 - time (sec): 3.16 - samples/sec: 14156.54 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:27:25,174 epoch 6 - iter 130/136 - loss 0.36864604 - time (sec): 3.50 - samples/sec: 14217.13 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:27:25,337 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:25,337 EPOCH 6 done: loss 0.3678 - lr: 0.000023
2023-10-20 00:27:26,103 DEV : loss 0.2851005494594574 - f1-score (micro avg) 0.2696
2023-10-20 00:27:26,107 saving best model
2023-10-20 00:27:26,137 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:26,513 epoch 7 - iter 13/136 - loss 0.41319372 - time (sec): 0.37 - samples/sec: 14234.49 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:27:26,864 epoch 7 - iter 26/136 - loss 0.38918016 - time (sec): 0.73 - samples/sec: 14613.85 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:27:27,220 epoch 7 - iter 39/136 - loss 0.37782537 - time (sec): 1.08 - samples/sec: 14896.89 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:27:27,573 epoch 7 - iter 52/136 - loss 0.35882322 - time (sec): 1.43 - samples/sec: 14644.40 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:27:27,919 epoch 7 - iter 65/136 - loss 0.35751503 - time (sec): 1.78 - samples/sec: 14157.69 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:27:28,279 epoch 7 - iter 78/136 - loss 0.34501957 - time (sec): 2.14 - samples/sec: 14333.12 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:27:28,640 epoch 7 - iter 91/136 - loss 0.34745316 - time (sec): 2.50 - samples/sec: 14243.65 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:27:28,993 epoch 7 - iter 104/136 - loss 0.34338987 - time (sec): 2.85 - samples/sec: 14195.63 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:27:29,322 epoch 7 - iter 117/136 - loss 0.34562531 - time (sec): 3.18 - samples/sec: 14058.38 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:27:29,686 epoch 7 - iter 130/136 - loss 0.34425524 - time (sec): 3.55 - samples/sec: 14108.36 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:27:29,840 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:29,841 EPOCH 7 done: loss 0.3456 - lr: 0.000017
2023-10-20 00:27:30,605 DEV : loss 0.27339091897010803 - f1-score (micro avg) 0.39
2023-10-20 00:27:30,609 saving best model
2023-10-20 00:27:30,639 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:31,001 epoch 8 - iter 13/136 - loss 0.26407522 - time (sec): 0.36 - samples/sec: 13674.20 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:27:31,521 epoch 8 - iter 26/136 - loss 0.31575481 - time (sec): 0.88 - samples/sec: 11408.81 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:27:31,884 epoch 8 - iter 39/136 - loss 0.33893523 - time (sec): 1.24 - samples/sec: 13306.81 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:27:32,233 epoch 8 - iter 52/136 - loss 0.33397543 - time (sec): 1.59 - samples/sec: 12976.12 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:27:32,585 epoch 8 - iter 65/136 - loss 0.34781438 - time (sec): 1.95 - samples/sec: 13406.32 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:27:32,938 epoch 8 - iter 78/136 - loss 0.35349291 - time (sec): 2.30 - samples/sec: 13593.25 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:27:33,320 epoch 8 - iter 91/136 - loss 0.34378778 - time (sec): 2.68 - samples/sec: 13781.68 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:27:33,656 epoch 8 - iter 104/136 - loss 0.34014186 - time (sec): 3.02 - samples/sec: 13587.33 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:27:34,006 epoch 8 - iter 117/136 - loss 0.34142943 - time (sec): 3.37 - samples/sec: 13532.46 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:27:34,342 epoch 8 - iter 130/136 - loss 0.34943882 - time (sec): 3.70 - samples/sec: 13511.12 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:27:34,492 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:34,492 EPOCH 8 done: loss 0.3475 - lr: 0.000012
2023-10-20 00:27:35,269 DEV : loss 0.27384471893310547 - f1-score (micro avg) 0.3753
2023-10-20 00:27:35,273 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:35,609 epoch 9 - iter 13/136 - loss 0.36272695 - time (sec): 0.34 - samples/sec: 14199.66 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:27:36,002 epoch 9 - iter 26/136 - loss 0.34985916 - time (sec): 0.73 - samples/sec: 14210.84 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:27:36,364 epoch 9 - iter 39/136 - loss 0.37371902 - time (sec): 1.09 - samples/sec: 14131.43 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:27:36,735 epoch 9 - iter 52/136 - loss 0.34718522 - time (sec): 1.46 - samples/sec: 14590.93 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:27:37,091 epoch 9 - iter 65/136 - loss 0.34096769 - time (sec): 1.82 - samples/sec: 14320.87 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:27:37,429 epoch 9 - iter 78/136 - loss 0.34713841 - time (sec): 2.16 - samples/sec: 13946.64 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:27:37,798 epoch 9 - iter 91/136 - loss 0.34101774 - time (sec): 2.52 - samples/sec: 14043.44 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:27:38,156 epoch 9 - iter 104/136 - loss 0.33583877 - time (sec): 2.88 - samples/sec: 14060.14 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:27:38,511 epoch 9 - iter 117/136 - loss 0.33839451 - time (sec): 3.24 - samples/sec: 14213.40 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:27:38,847 epoch 9 - iter 130/136 - loss 0.33797165 - time (sec): 3.57 - samples/sec: 14106.59 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:27:39,008 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:39,009 EPOCH 9 done: loss 0.3374 - lr: 0.000006
2023-10-20 00:27:39,775 DEV : loss 0.27189964056015015 - f1-score (micro avg) 0.4053
2023-10-20 00:27:39,779 saving best model
2023-10-20 00:27:39,814 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:40,179 epoch 10 - iter 13/136 - loss 0.36346730 - time (sec): 0.37 - samples/sec: 15436.52 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:27:40,529 epoch 10 - iter 26/136 - loss 0.32886607 - time (sec): 0.71 - samples/sec: 14312.16 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:27:40,872 epoch 10 - iter 39/136 - loss 0.32462501 - time (sec): 1.06 - samples/sec: 14506.08 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:27:41,225 epoch 10 - iter 52/136 - loss 0.32252604 - time (sec): 1.41 - samples/sec: 14447.57 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:27:41,594 epoch 10 - iter 65/136 - loss 0.31124105 - time (sec): 1.78 - samples/sec: 14289.21 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:27:41,892 epoch 10 - iter 78/136 - loss 0.31555264 - time (sec): 2.08 - samples/sec: 14366.24 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:27:42,210 epoch 10 - iter 91/136 - loss 0.31866405 - time (sec): 2.40 - samples/sec: 14494.69 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:27:42,523 epoch 10 - iter 104/136 - loss 0.32646837 - time (sec): 2.71 - samples/sec: 14646.78 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:27:42,856 epoch 10 - iter 117/136 - loss 0.32086288 - time (sec): 3.04 - samples/sec: 14683.40 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:27:43,385 epoch 10 - iter 130/136 - loss 0.32547930 - time (sec): 3.57 - samples/sec: 13985.99 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:27:43,536 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:43,536 EPOCH 10 done: loss 0.3254 - lr: 0.000000
2023-10-20 00:27:44,298 DEV : loss 0.2700875699520111 - f1-score (micro avg) 0.4053
2023-10-20 00:27:44,328 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:44,328 Loading model from best epoch ...
2023-10-20 00:27:44,402 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
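The 17 tags form a BIOES scheme: O plus S- (single-token), B- (begin), E- (end), and I- (inside) variants for each of the four entity types, which is also why the linear output layer above has out_features=17. A small illustrative decoder for such sequences (a simplified sketch, not Flair's actual span-decoding code):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                     # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":                   # open a multi-token entity
            start, label = i, lab
        elif prefix == "E" and start is not None and label == lab:
            spans.append((lab, start, i + 1))  # close the open entity
            start, label = None, None
        # "I" simply continues an open span; malformed sequences are dropped
    return spans

print(bioes_to_spans(["B-LOC", "I-LOC", "E-LOC", "O", "S-PER"]))
# → [('LOC', 0, 3), ('PER', 4, 5)]
```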
2023-10-20 00:27:45,224
Results:
- F-score (micro) 0.3025
- F-score (macro) 0.1634
- Accuracy 0.1879

By class:
              precision    recall  f1-score   support

         PER     0.1911    0.2692    0.2236       208
         LOC     0.5856    0.3397    0.4300       312
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3418    0.2714    0.3025       597
   macro avg     0.1942    0.1522    0.1634       597
weighted avg     0.3727    0.2714    0.3026       597

2023-10-20 00:27:45,225 ----------------------------------------------------------------------------------------------------
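The three aggregate rows follow directly from the per-class rows: macro F1 is the unweighted mean of the four class F1 scores, weighted F1 weights each class F1 by its support, and micro F1 is the harmonic mean of the micro-averaged precision and recall. A quick re-derivation from the table values (rounded figures copied from the report, so the last digit can differ slightly):

```python
# (precision, recall, f1, support) per class, copied from the report above
rows = {
    "PER":       (0.1911, 0.2692, 0.2236, 208),
    "LOC":       (0.5856, 0.3397, 0.4300, 312),
    "ORG":       (0.0000, 0.0000, 0.0000, 55),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}
total_support = sum(s for *_, s in rows.values())          # 597
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)
weighted_f1 = sum(f1 * s for _, _, f1, s in rows.values()) / total_support
p, r = 0.3418, 0.2714                                       # micro avg precision / recall
micro_f1 = 2 * p * r / (p + r)
print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# → 0.1634 0.3026 0.3026  (micro rounds up here; the report's 0.3025 uses unrounded p/r)
```

This confirms the macro (0.1634) and weighted (0.3026) rows, and shows why ORG and HumanProd, with zero F1, drag the macro average far below the micro average.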