|
2023-10-19 23:49:44,654 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,654 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-19 23:49:44,654 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,654 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Train: 1166 sentences |
|
2023-10-19 23:49:44,655 (train_with_dev=False, train_with_test=False) |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Training Params: |
|
2023-10-19 23:49:44,655 - learning_rate: "3e-05" |
|
2023-10-19 23:49:44,655 - mini_batch_size: "4" |
|
2023-10-19 23:49:44,655 - max_epochs: "10" |
|
2023-10-19 23:49:44,655 - shuffle: "True" |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Plugins: |
|
2023-10-19 23:49:44,655 - TensorboardLogger |
|
2023-10-19 23:49:44,655 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-19 23:49:44,655 - metric: "('micro avg', 'f1-score')" |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Computation: |
|
2023-10-19 23:49:44,655 - compute on device: cuda:0 |
|
2023-10-19 23:49:44,655 - embedding storage: none |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:44,655 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-19 23:49:45,171 epoch 1 - iter 29/292 - loss 3.23366766 - time (sec): 0.52 - samples/sec: 9308.03 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:49:45,672 epoch 1 - iter 58/292 - loss 3.25275791 - time (sec): 1.02 - samples/sec: 8780.75 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:49:46,209 epoch 1 - iter 87/292 - loss 3.09522052 - time (sec): 1.55 - samples/sec: 8385.87 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:49:46,752 epoch 1 - iter 116/292 - loss 2.96850491 - time (sec): 2.10 - samples/sec: 8369.71 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:49:47,307 epoch 1 - iter 145/292 - loss 2.77725528 - time (sec): 2.65 - samples/sec: 8488.65 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:49:47,849 epoch 1 - iter 174/292 - loss 2.58029420 - time (sec): 3.19 - samples/sec: 8553.64 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:49:48,361 epoch 1 - iter 203/292 - loss 2.39676813 - time (sec): 3.70 - samples/sec: 8574.16 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:49:48,846 epoch 1 - iter 232/292 - loss 2.23157178 - time (sec): 4.19 - samples/sec: 8552.98 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:49:49,383 epoch 1 - iter 261/292 - loss 2.09117800 - time (sec): 4.73 - samples/sec: 8454.45 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:49:49,865 epoch 1 - iter 290/292 - loss 1.99194641 - time (sec): 5.21 - samples/sec: 8489.62 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:49:49,893 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:49,893 EPOCH 1 done: loss 1.9855 - lr: 0.000030 |
|
2023-10-19 23:49:50,154 DEV : loss 0.45264244079589844 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:49:50,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:50,701 epoch 2 - iter 29/292 - loss 0.92771541 - time (sec): 0.54 - samples/sec: 9419.62 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:49:51,190 epoch 2 - iter 58/292 - loss 0.79135239 - time (sec): 1.03 - samples/sec: 8802.41 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:49:51,708 epoch 2 - iter 87/292 - loss 0.76416654 - time (sec): 1.55 - samples/sec: 8865.95 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:49:52,232 epoch 2 - iter 116/292 - loss 0.75800533 - time (sec): 2.07 - samples/sec: 8888.41 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:49:52,749 epoch 2 - iter 145/292 - loss 0.74324513 - time (sec): 2.59 - samples/sec: 8725.33 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:49:53,239 epoch 2 - iter 174/292 - loss 0.72235129 - time (sec): 3.08 - samples/sec: 8666.19 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:49:53,750 epoch 2 - iter 203/292 - loss 0.68411345 - time (sec): 3.59 - samples/sec: 8684.86 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:49:54,296 epoch 2 - iter 232/292 - loss 0.66363694 - time (sec): 4.14 - samples/sec: 8702.86 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:49:54,805 epoch 2 - iter 261/292 - loss 0.65749594 - time (sec): 4.65 - samples/sec: 8595.78 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:49:55,316 epoch 2 - iter 290/292 - loss 0.64381828 - time (sec): 5.16 - samples/sec: 8537.64 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:49:55,353 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:55,353 EPOCH 2 done: loss 0.6399 - lr: 0.000027 |
|
2023-10-19 23:49:55,987 DEV : loss 0.4109416902065277 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:49:55,991 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:49:56,511 epoch 3 - iter 29/292 - loss 0.56836488 - time (sec): 0.52 - samples/sec: 8126.62 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:49:57,065 epoch 3 - iter 58/292 - loss 0.52701593 - time (sec): 1.07 - samples/sec: 8165.60 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:49:57,621 epoch 3 - iter 87/292 - loss 0.53413707 - time (sec): 1.63 - samples/sec: 8445.86 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:49:58,141 epoch 3 - iter 116/292 - loss 0.53704706 - time (sec): 2.15 - samples/sec: 8606.37 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:49:58,652 epoch 3 - iter 145/292 - loss 0.53492671 - time (sec): 2.66 - samples/sec: 8476.20 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:49:59,173 epoch 3 - iter 174/292 - loss 0.52807808 - time (sec): 3.18 - samples/sec: 8499.75 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:49:59,678 epoch 3 - iter 203/292 - loss 0.52293822 - time (sec): 3.69 - samples/sec: 8483.35 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:50:00,319 epoch 3 - iter 232/292 - loss 0.53248911 - time (sec): 4.33 - samples/sec: 8196.29 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:50:00,857 epoch 3 - iter 261/292 - loss 0.55584902 - time (sec): 4.87 - samples/sec: 8346.87 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:50:01,356 epoch 3 - iter 290/292 - loss 0.54349142 - time (sec): 5.36 - samples/sec: 8243.27 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:50:01,385 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:01,386 EPOCH 3 done: loss 0.5425 - lr: 0.000023 |
|
2023-10-19 23:50:02,033 DEV : loss 0.3629387617111206 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:50:02,037 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:02,583 epoch 4 - iter 29/292 - loss 0.44234018 - time (sec): 0.55 - samples/sec: 8145.45 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:50:03,112 epoch 4 - iter 58/292 - loss 0.47095915 - time (sec): 1.08 - samples/sec: 8028.67 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:50:03,606 epoch 4 - iter 87/292 - loss 0.48713525 - time (sec): 1.57 - samples/sec: 8091.97 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:50:04,084 epoch 4 - iter 116/292 - loss 0.54057413 - time (sec): 2.05 - samples/sec: 8646.72 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:50:04,597 epoch 4 - iter 145/292 - loss 0.54157699 - time (sec): 2.56 - samples/sec: 8471.16 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:50:05,106 epoch 4 - iter 174/292 - loss 0.52419907 - time (sec): 3.07 - samples/sec: 8467.18 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:50:05,603 epoch 4 - iter 203/292 - loss 0.50823803 - time (sec): 3.57 - samples/sec: 8393.88 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:50:06,145 epoch 4 - iter 232/292 - loss 0.49738370 - time (sec): 4.11 - samples/sec: 8590.14 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:50:06,650 epoch 4 - iter 261/292 - loss 0.48959920 - time (sec): 4.61 - samples/sec: 8553.94 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:50:07,166 epoch 4 - iter 290/292 - loss 0.48932254 - time (sec): 5.13 - samples/sec: 8567.77 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:50:07,205 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:07,205 EPOCH 4 done: loss 0.4862 - lr: 0.000020 |
|
2023-10-19 23:50:07,841 DEV : loss 0.3369694650173187 - f1-score (micro avg) 0.0522 |
|
2023-10-19 23:50:07,845 saving best model |
|
2023-10-19 23:50:07,873 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:08,389 epoch 5 - iter 29/292 - loss 0.59986532 - time (sec): 0.52 - samples/sec: 9669.90 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:50:08,890 epoch 5 - iter 58/292 - loss 0.54446958 - time (sec): 1.02 - samples/sec: 8851.61 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:50:09,415 epoch 5 - iter 87/292 - loss 0.52382245 - time (sec): 1.54 - samples/sec: 8780.25 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:50:09,958 epoch 5 - iter 116/292 - loss 0.49345204 - time (sec): 2.08 - samples/sec: 8805.88 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:50:10,521 epoch 5 - iter 145/292 - loss 0.46446324 - time (sec): 2.65 - samples/sec: 8647.69 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:50:10,998 epoch 5 - iter 174/292 - loss 0.45495738 - time (sec): 3.12 - samples/sec: 8585.51 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:50:11,506 epoch 5 - iter 203/292 - loss 0.44868989 - time (sec): 3.63 - samples/sec: 8651.19 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:50:12,002 epoch 5 - iter 232/292 - loss 0.45470579 - time (sec): 4.13 - samples/sec: 8429.60 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:50:12,488 epoch 5 - iter 261/292 - loss 0.45728633 - time (sec): 4.61 - samples/sec: 8503.52 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:50:12,998 epoch 5 - iter 290/292 - loss 0.45813778 - time (sec): 5.12 - samples/sec: 8626.62 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:50:13,028 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:13,029 EPOCH 5 done: loss 0.4589 - lr: 0.000017 |
|
2023-10-19 23:50:13,664 DEV : loss 0.3155231177806854 - f1-score (micro avg) 0.2 |
|
2023-10-19 23:50:13,668 saving best model |
|
2023-10-19 23:50:13,701 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:14,204 epoch 6 - iter 29/292 - loss 0.53462839 - time (sec): 0.50 - samples/sec: 8790.02 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:50:14,710 epoch 6 - iter 58/292 - loss 0.45387607 - time (sec): 1.01 - samples/sec: 8763.56 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:50:15,202 epoch 6 - iter 87/292 - loss 0.43858593 - time (sec): 1.50 - samples/sec: 8859.98 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:50:15,682 epoch 6 - iter 116/292 - loss 0.44130740 - time (sec): 1.98 - samples/sec: 8591.37 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:50:16,206 epoch 6 - iter 145/292 - loss 0.43534614 - time (sec): 2.50 - samples/sec: 8722.92 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:50:16,721 epoch 6 - iter 174/292 - loss 0.45146060 - time (sec): 3.02 - samples/sec: 8892.81 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:50:17,227 epoch 6 - iter 203/292 - loss 0.43994146 - time (sec): 3.53 - samples/sec: 8858.23 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:50:17,738 epoch 6 - iter 232/292 - loss 0.43069230 - time (sec): 4.04 - samples/sec: 8776.63 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:50:18,259 epoch 6 - iter 261/292 - loss 0.42140208 - time (sec): 4.56 - samples/sec: 8859.96 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:50:18,773 epoch 6 - iter 290/292 - loss 0.42268096 - time (sec): 5.07 - samples/sec: 8732.95 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:50:18,802 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:18,802 EPOCH 6 done: loss 0.4225 - lr: 0.000013 |
|
2023-10-19 23:50:19,452 DEV : loss 0.3091878592967987 - f1-score (micro avg) 0.2337 |
|
2023-10-19 23:50:19,457 saving best model |
|
2023-10-19 23:50:19,488 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:19,962 epoch 7 - iter 29/292 - loss 0.36939978 - time (sec): 0.47 - samples/sec: 8508.54 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:50:20,469 epoch 7 - iter 58/292 - loss 0.34863665 - time (sec): 0.98 - samples/sec: 8253.29 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:50:20,967 epoch 7 - iter 87/292 - loss 0.37735444 - time (sec): 1.48 - samples/sec: 8488.20 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:50:21,461 epoch 7 - iter 116/292 - loss 0.42584412 - time (sec): 1.97 - samples/sec: 8521.83 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:50:21,956 epoch 7 - iter 145/292 - loss 0.41926330 - time (sec): 2.47 - samples/sec: 8485.20 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:50:22,465 epoch 7 - iter 174/292 - loss 0.41470611 - time (sec): 2.98 - samples/sec: 8410.67 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:50:22,985 epoch 7 - iter 203/292 - loss 0.41112570 - time (sec): 3.50 - samples/sec: 8532.31 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:50:23,524 epoch 7 - iter 232/292 - loss 0.41695875 - time (sec): 4.04 - samples/sec: 8494.03 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:50:24,066 epoch 7 - iter 261/292 - loss 0.41595517 - time (sec): 4.58 - samples/sec: 8616.91 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:50:24,614 epoch 7 - iter 290/292 - loss 0.40657648 - time (sec): 5.13 - samples/sec: 8630.87 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:50:24,651 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:24,651 EPOCH 7 done: loss 0.4063 - lr: 0.000010 |
|
2023-10-19 23:50:25,299 DEV : loss 0.30544406175613403 - f1-score (micro avg) 0.2442 |
|
2023-10-19 23:50:25,304 saving best model |
|
2023-10-19 23:50:25,335 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:25,862 epoch 8 - iter 29/292 - loss 0.46693410 - time (sec): 0.53 - samples/sec: 9430.55 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:50:26,374 epoch 8 - iter 58/292 - loss 0.45223574 - time (sec): 1.04 - samples/sec: 8889.98 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:50:26,919 epoch 8 - iter 87/292 - loss 0.41985147 - time (sec): 1.58 - samples/sec: 8638.02 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:50:27,500 epoch 8 - iter 116/292 - loss 0.39639438 - time (sec): 2.16 - samples/sec: 8397.44 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:50:27,950 epoch 8 - iter 145/292 - loss 0.38044177 - time (sec): 2.61 - samples/sec: 8685.66 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:50:28,444 epoch 8 - iter 174/292 - loss 0.39141118 - time (sec): 3.11 - samples/sec: 8534.62 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:50:28,981 epoch 8 - iter 203/292 - loss 0.38177708 - time (sec): 3.65 - samples/sec: 8433.50 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:50:29,486 epoch 8 - iter 232/292 - loss 0.39149494 - time (sec): 4.15 - samples/sec: 8434.27 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:50:30,009 epoch 8 - iter 261/292 - loss 0.38697211 - time (sec): 4.67 - samples/sec: 8473.81 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:50:30,523 epoch 8 - iter 290/292 - loss 0.38508142 - time (sec): 5.19 - samples/sec: 8518.94 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:50:30,554 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:30,554 EPOCH 8 done: loss 0.3861 - lr: 0.000007 |
|
2023-10-19 23:50:31,188 DEV : loss 0.306864857673645 - f1-score (micro avg) 0.2274 |
|
2023-10-19 23:50:31,192 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:31,687 epoch 9 - iter 29/292 - loss 0.34080573 - time (sec): 0.49 - samples/sec: 8069.80 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:50:32,174 epoch 9 - iter 58/292 - loss 0.39740699 - time (sec): 0.98 - samples/sec: 8275.65 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:50:32,672 epoch 9 - iter 87/292 - loss 0.38941554 - time (sec): 1.48 - samples/sec: 8329.76 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:50:33,175 epoch 9 - iter 116/292 - loss 0.37411188 - time (sec): 1.98 - samples/sec: 8364.62 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:50:33,678 epoch 9 - iter 145/292 - loss 0.37705396 - time (sec): 2.49 - samples/sec: 8584.38 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:50:34,207 epoch 9 - iter 174/292 - loss 0.37556133 - time (sec): 3.01 - samples/sec: 8673.00 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:50:34,741 epoch 9 - iter 203/292 - loss 0.38081345 - time (sec): 3.55 - samples/sec: 8845.59 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:50:35,256 epoch 9 - iter 232/292 - loss 0.38515839 - time (sec): 4.06 - samples/sec: 8916.11 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:50:35,754 epoch 9 - iter 261/292 - loss 0.38318305 - time (sec): 4.56 - samples/sec: 8878.53 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:50:36,245 epoch 9 - iter 290/292 - loss 0.38427222 - time (sec): 5.05 - samples/sec: 8745.25 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:50:36,279 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:36,279 EPOCH 9 done: loss 0.3833 - lr: 0.000003 |
|
2023-10-19 23:50:37,074 DEV : loss 0.306318461894989 - f1-score (micro avg) 0.2234 |
|
2023-10-19 23:50:37,078 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:37,653 epoch 10 - iter 29/292 - loss 0.30956639 - time (sec): 0.58 - samples/sec: 8756.50 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:50:38,195 epoch 10 - iter 58/292 - loss 0.35087803 - time (sec): 1.12 - samples/sec: 8239.31 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:50:38,681 epoch 10 - iter 87/292 - loss 0.35546959 - time (sec): 1.60 - samples/sec: 8456.65 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:50:39,183 epoch 10 - iter 116/292 - loss 0.34859194 - time (sec): 2.10 - samples/sec: 8631.38 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:50:39,649 epoch 10 - iter 145/292 - loss 0.37253886 - time (sec): 2.57 - samples/sec: 8795.86 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:50:40,139 epoch 10 - iter 174/292 - loss 0.37750606 - time (sec): 3.06 - samples/sec: 8752.59 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:50:40,599 epoch 10 - iter 203/292 - loss 0.38037581 - time (sec): 3.52 - samples/sec: 8598.64 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:50:41,069 epoch 10 - iter 232/292 - loss 0.37226905 - time (sec): 3.99 - samples/sec: 8751.82 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:50:41,578 epoch 10 - iter 261/292 - loss 0.37712323 - time (sec): 4.50 - samples/sec: 8867.59 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 23:50:42,056 epoch 10 - iter 290/292 - loss 0.37760770 - time (sec): 4.98 - samples/sec: 8879.88 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 23:50:42,079 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:42,080 EPOCH 10 done: loss 0.3783 - lr: 0.000000 |
|
2023-10-19 23:50:42,723 DEV : loss 0.308378130197525 - f1-score (micro avg) 0.2222 |
|
2023-10-19 23:50:42,755 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:50:42,756 Loading model from best epoch ... |
|
2023-10-19 23:50:42,833 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-19 23:50:43,726 |
|
Results: |
|
- F-score (micro) 0.2325 |
|
- F-score (macro) 0.1217 |
|
- Accuracy 0.137 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.2661    0.2615    0.2638       348 |
|
         LOC     0.2537    0.1992    0.2232       261 |
|
         ORG     0.0000    0.0000    0.0000        52 |
|
   HumanProd     0.0000    0.0000    0.0000        22 |
|
|
|
   micro avg     0.2614    0.2094    0.2325       683 |
|
   macro avg     0.1299    0.1152    0.1217       683 |
|
weighted avg     0.2325    0.2094    0.2197       683 |
|
|
|
2023-10-19 23:50:43,726 ---------------------------------------------------------------------------------------------------- |
|
|