|
2023-10-19 23:42:06,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,059 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-19 23:42:06,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,059 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-19 23:42:06,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,059 Train: 1166 sentences |
|
2023-10-19 23:42:06,059 (train_with_dev=False, train_with_test=False) |
|
2023-10-19 23:42:06,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,059 Training Params: |
|
2023-10-19 23:42:06,059 - learning_rate: "3e-05" |
|
2023-10-19 23:42:06,059 - mini_batch_size: "4" |
|
2023-10-19 23:42:06,059 - max_epochs: "10" |
|
2023-10-19 23:42:06,060 - shuffle: "True" |
|
2023-10-19 23:42:06,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,060 Plugins: |
|
2023-10-19 23:42:06,060 - TensorboardLogger |
|
2023-10-19 23:42:06,060 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-19 23:42:06,060 ---------------------------------------------------------------------------------------------------- |
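Note on the `lr` column in the iteration lines that follow: the values are consistent with a linear warmup over the first 10% of all optimizer steps followed by a linear decay to zero. The sketch below reproduces that shape from the logged parameters (1166 training sentences, mini_batch_size 4, max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1). The function name `linear_schedule_lr` is hypothetical, and it is an assumption that the LinearScheduler plugin computes exactly this formula; the sketch only shows that the logged values fit it.

```python
import math

# Values taken from the log above.
TRAIN_SENTENCES = 1166
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 3e-5
WARMUP_FRACTION = 0.1

# 1166 sentences in batches of 4 gives 292 iterations per epoch,
# which matches the "iter N/292" counters in the log.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS


def linear_schedule_lr(step, total=total_steps, peak=PEAK_LR,
                       warmup_fraction=WARMUP_FRACTION):
    """Hypothetical helper: linear warmup to `peak`, then linear decay to 0."""
    warmup_steps = int(total * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows proportionally to the step count.
        return peak * step / max(1, warmup_steps)
    # Decay phase: lr shrinks linearly until the final step.
    return peak * max(0.0, (total - step) / max(1, total - warmup_steps))


if __name__ == "__main__":
    # Print the scheduled lr at the end of each epoch.
    for epoch in range(1, MAX_EPOCHS + 1):
        step = epoch * steps_per_epoch
        print(f"epoch {epoch:2d} end: lr = {linear_schedule_lr(step):.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1 (292 of 2920 steps), which is why the logged `lr` peaks at 0.000030 there and falls by roughly 0.000003 per epoch afterwards.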
|
2023-10-19 23:42:06,060 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-19 23:42:06,060 - metric: "('micro avg', 'f1-score')" |
|
2023-10-19 23:42:06,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,060 Computation: |
|
2023-10-19 23:42:06,060 - compute on device: cuda:0 |
|
2023-10-19 23:42:06,060 - embedding storage: none |
|
2023-10-19 23:42:06,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,060 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-19 23:42:06,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:06,060 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-19 23:42:06,967 epoch 1 - iter 29/292 - loss 3.08591664 - time (sec): 0.91 - samples/sec: 5239.80 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:42:07,454 epoch 1 - iter 58/292 - loss 3.08024881 - time (sec): 1.39 - samples/sec: 6420.54 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:42:07,977 epoch 1 - iter 87/292 - loss 3.05119373 - time (sec): 1.92 - samples/sec: 7148.76 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:42:08,526 epoch 1 - iter 116/292 - loss 3.00830650 - time (sec): 2.47 - samples/sec: 7672.61 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:42:09,042 epoch 1 - iter 145/292 - loss 2.82533115 - time (sec): 2.98 - samples/sec: 7772.64 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:42:09,539 epoch 1 - iter 174/292 - loss 2.68196572 - time (sec): 3.48 - samples/sec: 7733.92 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:42:10,050 epoch 1 - iter 203/292 - loss 2.53889030 - time (sec): 3.99 - samples/sec: 7571.16 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:42:10,589 epoch 1 - iter 232/292 - loss 2.30235004 - time (sec): 4.53 - samples/sec: 7780.95 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:42:11,108 epoch 1 - iter 261/292 - loss 2.13326537 - time (sec): 5.05 - samples/sec: 7826.92 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:42:11,639 epoch 1 - iter 290/292 - loss 1.97964118 - time (sec): 5.58 - samples/sec: 7947.31 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:42:11,666 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:11,666 EPOCH 1 done: loss 1.9775 - lr: 0.000030 |
|
2023-10-19 23:42:12,081 DEV : loss 0.46280646324157715 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:42:12,085 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:12,612 epoch 2 - iter 29/292 - loss 0.75100566 - time (sec): 0.53 - samples/sec: 9942.53 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:42:13,156 epoch 2 - iter 58/292 - loss 0.79963907 - time (sec): 1.07 - samples/sec: 9250.93 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:42:13,658 epoch 2 - iter 87/292 - loss 0.77458000 - time (sec): 1.57 - samples/sec: 8957.69 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:42:14,177 epoch 2 - iter 116/292 - loss 0.73709213 - time (sec): 2.09 - samples/sec: 9063.02 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:42:14,696 epoch 2 - iter 145/292 - loss 0.73824341 - time (sec): 2.61 - samples/sec: 8719.67 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:42:15,239 epoch 2 - iter 174/292 - loss 0.70725517 - time (sec): 3.15 - samples/sec: 8521.16 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:42:15,758 epoch 2 - iter 203/292 - loss 0.69621852 - time (sec): 3.67 - samples/sec: 8545.85 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:42:16,275 epoch 2 - iter 232/292 - loss 0.68754116 - time (sec): 4.19 - samples/sec: 8578.64 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:42:16,722 epoch 2 - iter 261/292 - loss 0.67062185 - time (sec): 4.64 - samples/sec: 8662.35 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:42:17,188 epoch 2 - iter 290/292 - loss 0.65439760 - time (sec): 5.10 - samples/sec: 8685.95 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:42:17,214 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:17,214 EPOCH 2 done: loss 0.6545 - lr: 0.000027 |
|
2023-10-19 23:42:17,832 DEV : loss 0.4219829738140106 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:42:17,836 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:18,310 epoch 3 - iter 29/292 - loss 0.49006854 - time (sec): 0.47 - samples/sec: 9074.56 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:42:18,847 epoch 3 - iter 58/292 - loss 0.73232854 - time (sec): 1.01 - samples/sec: 8510.41 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:42:19,353 epoch 3 - iter 87/292 - loss 0.68251336 - time (sec): 1.52 - samples/sec: 8519.91 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:42:19,875 epoch 3 - iter 116/292 - loss 0.63235067 - time (sec): 2.04 - samples/sec: 8668.90 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:42:20,382 epoch 3 - iter 145/292 - loss 0.62468572 - time (sec): 2.55 - samples/sec: 8392.58 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:42:20,935 epoch 3 - iter 174/292 - loss 0.62606584 - time (sec): 3.10 - samples/sec: 8613.21 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:42:21,480 epoch 3 - iter 203/292 - loss 0.60290014 - time (sec): 3.64 - samples/sec: 8616.36 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:42:22,003 epoch 3 - iter 232/292 - loss 0.59361057 - time (sec): 4.17 - samples/sec: 8458.53 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:42:22,517 epoch 3 - iter 261/292 - loss 0.58612670 - time (sec): 4.68 - samples/sec: 8481.25 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:42:23,030 epoch 3 - iter 290/292 - loss 0.57753323 - time (sec): 5.19 - samples/sec: 8521.11 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:42:23,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:23,060 EPOCH 3 done: loss 0.5763 - lr: 0.000023 |
|
2023-10-19 23:42:23,827 DEV : loss 0.3529767692089081 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:42:23,831 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:24,353 epoch 4 - iter 29/292 - loss 0.49689288 - time (sec): 0.52 - samples/sec: 8422.11 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:42:24,894 epoch 4 - iter 58/292 - loss 0.48782282 - time (sec): 1.06 - samples/sec: 8283.51 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:42:25,397 epoch 4 - iter 87/292 - loss 0.47892406 - time (sec): 1.57 - samples/sec: 8598.67 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:42:25,913 epoch 4 - iter 116/292 - loss 0.47218105 - time (sec): 2.08 - samples/sec: 8481.60 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:42:26,418 epoch 4 - iter 145/292 - loss 0.47202192 - time (sec): 2.59 - samples/sec: 8533.59 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:42:26,959 epoch 4 - iter 174/292 - loss 0.49078431 - time (sec): 3.13 - samples/sec: 8760.33 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:42:27,471 epoch 4 - iter 203/292 - loss 0.49917888 - time (sec): 3.64 - samples/sec: 8606.24 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:42:27,985 epoch 4 - iter 232/292 - loss 0.49683508 - time (sec): 4.15 - samples/sec: 8581.77 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:42:28,526 epoch 4 - iter 261/292 - loss 0.49461759 - time (sec): 4.69 - samples/sec: 8607.84 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:42:29,018 epoch 4 - iter 290/292 - loss 0.48994028 - time (sec): 5.19 - samples/sec: 8493.63 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:42:29,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:29,054 EPOCH 4 done: loss 0.4878 - lr: 0.000020 |
|
2023-10-19 23:42:29,684 DEV : loss 0.33182382583618164 - f1-score (micro avg) 0.0519 |
|
2023-10-19 23:42:29,688 saving best model |
|
2023-10-19 23:42:29,718 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:30,243 epoch 5 - iter 29/292 - loss 0.47860587 - time (sec): 0.52 - samples/sec: 8301.65 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:42:30,758 epoch 5 - iter 58/292 - loss 0.45421070 - time (sec): 1.04 - samples/sec: 8889.91 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:42:31,321 epoch 5 - iter 87/292 - loss 0.50017143 - time (sec): 1.60 - samples/sec: 9120.81 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:42:31,851 epoch 5 - iter 116/292 - loss 0.50574198 - time (sec): 2.13 - samples/sec: 8859.29 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:42:32,349 epoch 5 - iter 145/292 - loss 0.48939797 - time (sec): 2.63 - samples/sec: 8699.48 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:42:32,872 epoch 5 - iter 174/292 - loss 0.47205527 - time (sec): 3.15 - samples/sec: 8561.62 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:42:33,426 epoch 5 - iter 203/292 - loss 0.47253502 - time (sec): 3.71 - samples/sec: 8455.14 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:42:33,921 epoch 5 - iter 232/292 - loss 0.46611011 - time (sec): 4.20 - samples/sec: 8486.53 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:42:34,435 epoch 5 - iter 261/292 - loss 0.45700445 - time (sec): 4.72 - samples/sec: 8402.90 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:42:34,991 epoch 5 - iter 290/292 - loss 0.44717647 - time (sec): 5.27 - samples/sec: 8382.89 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:42:35,027 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:35,027 EPOCH 5 done: loss 0.4469 - lr: 0.000017 |
|
2023-10-19 23:42:35,658 DEV : loss 0.329159677028656 - f1-score (micro avg) 0.11 |
|
2023-10-19 23:42:35,661 saving best model |
|
2023-10-19 23:42:35,693 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:36,237 epoch 6 - iter 29/292 - loss 0.45931813 - time (sec): 0.54 - samples/sec: 9314.72 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:42:36,754 epoch 6 - iter 58/292 - loss 0.42547849 - time (sec): 1.06 - samples/sec: 8689.77 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:42:37,317 epoch 6 - iter 87/292 - loss 0.47113410 - time (sec): 1.62 - samples/sec: 8916.09 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:42:37,839 epoch 6 - iter 116/292 - loss 0.46155777 - time (sec): 2.15 - samples/sec: 8628.72 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:42:38,402 epoch 6 - iter 145/292 - loss 0.45884388 - time (sec): 2.71 - samples/sec: 8457.57 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:42:38,917 epoch 6 - iter 174/292 - loss 0.44123585 - time (sec): 3.22 - samples/sec: 8597.19 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:42:39,456 epoch 6 - iter 203/292 - loss 0.42472976 - time (sec): 3.76 - samples/sec: 8436.29 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:42:39,973 epoch 6 - iter 232/292 - loss 0.42486810 - time (sec): 4.28 - samples/sec: 8330.37 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:42:40,492 epoch 6 - iter 261/292 - loss 0.42696152 - time (sec): 4.80 - samples/sec: 8374.98 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:42:40,995 epoch 6 - iter 290/292 - loss 0.42408929 - time (sec): 5.30 - samples/sec: 8364.21 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:42:41,020 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:41,020 EPOCH 6 done: loss 0.4244 - lr: 0.000013 |
|
2023-10-19 23:42:41,649 DEV : loss 0.32003137469291687 - f1-score (micro avg) 0.152 |
|
2023-10-19 23:42:41,652 saving best model |
|
2023-10-19 23:42:41,684 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:42,226 epoch 7 - iter 29/292 - loss 0.37862602 - time (sec): 0.54 - samples/sec: 9870.25 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:42:42,716 epoch 7 - iter 58/292 - loss 0.40722645 - time (sec): 1.03 - samples/sec: 9014.41 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:42:43,225 epoch 7 - iter 87/292 - loss 0.40708103 - time (sec): 1.54 - samples/sec: 8485.94 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:42:43,749 epoch 7 - iter 116/292 - loss 0.38567811 - time (sec): 2.06 - samples/sec: 8628.47 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:42:44,255 epoch 7 - iter 145/292 - loss 0.38065430 - time (sec): 2.57 - samples/sec: 8677.84 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:42:44,729 epoch 7 - iter 174/292 - loss 0.39464854 - time (sec): 3.04 - samples/sec: 8592.58 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:42:45,199 epoch 7 - iter 203/292 - loss 0.39435703 - time (sec): 3.51 - samples/sec: 8462.62 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:42:45,716 epoch 7 - iter 232/292 - loss 0.41885136 - time (sec): 4.03 - samples/sec: 8568.32 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:42:46,250 epoch 7 - iter 261/292 - loss 0.40635836 - time (sec): 4.56 - samples/sec: 8629.52 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:42:46,796 epoch 7 - iter 290/292 - loss 0.40642788 - time (sec): 5.11 - samples/sec: 8648.25 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:42:46,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:46,827 EPOCH 7 done: loss 0.4068 - lr: 0.000010 |
|
2023-10-19 23:42:47,470 DEV : loss 0.3101561963558197 - f1-score (micro avg) 0.1771 |
|
2023-10-19 23:42:47,474 saving best model |
|
2023-10-19 23:42:47,505 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:48,025 epoch 8 - iter 29/292 - loss 0.39692250 - time (sec): 0.52 - samples/sec: 8387.53 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:42:48,555 epoch 8 - iter 58/292 - loss 0.40211436 - time (sec): 1.05 - samples/sec: 8139.94 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:42:49,109 epoch 8 - iter 87/292 - loss 0.37946836 - time (sec): 1.60 - samples/sec: 8053.82 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:42:49,638 epoch 8 - iter 116/292 - loss 0.37436871 - time (sec): 2.13 - samples/sec: 8028.81 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:42:50,165 epoch 8 - iter 145/292 - loss 0.38320782 - time (sec): 2.66 - samples/sec: 8267.49 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:42:50,661 epoch 8 - iter 174/292 - loss 0.37239817 - time (sec): 3.16 - samples/sec: 8175.80 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:42:51,161 epoch 8 - iter 203/292 - loss 0.38670704 - time (sec): 3.66 - samples/sec: 8324.35 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:42:51,691 epoch 8 - iter 232/292 - loss 0.40156103 - time (sec): 4.19 - samples/sec: 8530.79 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:42:52,196 epoch 8 - iter 261/292 - loss 0.40115704 - time (sec): 4.69 - samples/sec: 8501.21 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:42:52,760 epoch 8 - iter 290/292 - loss 0.39705366 - time (sec): 5.25 - samples/sec: 8409.61 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:42:52,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:52,799 EPOCH 8 done: loss 0.3972 - lr: 0.000007 |
|
2023-10-19 23:42:53,436 DEV : loss 0.3071003556251526 - f1-score (micro avg) 0.1917 |
|
2023-10-19 23:42:53,440 saving best model |
|
2023-10-19 23:42:53,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:53,940 epoch 9 - iter 29/292 - loss 0.29746546 - time (sec): 0.47 - samples/sec: 10070.18 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:42:54,409 epoch 9 - iter 58/292 - loss 0.35506392 - time (sec): 0.94 - samples/sec: 9508.66 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:42:54,886 epoch 9 - iter 87/292 - loss 0.35396730 - time (sec): 1.41 - samples/sec: 9629.12 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:42:55,379 epoch 9 - iter 116/292 - loss 0.36343270 - time (sec): 1.91 - samples/sec: 9125.81 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:42:55,864 epoch 9 - iter 145/292 - loss 0.36432377 - time (sec): 2.39 - samples/sec: 9111.20 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:42:56,392 epoch 9 - iter 174/292 - loss 0.36168872 - time (sec): 2.92 - samples/sec: 9146.51 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:42:56,978 epoch 9 - iter 203/292 - loss 0.36858999 - time (sec): 3.51 - samples/sec: 9023.83 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:42:57,502 epoch 9 - iter 232/292 - loss 0.37232179 - time (sec): 4.03 - samples/sec: 8917.92 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:42:58,036 epoch 9 - iter 261/292 - loss 0.38336928 - time (sec): 4.56 - samples/sec: 8912.37 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:42:58,512 epoch 9 - iter 290/292 - loss 0.38695651 - time (sec): 5.04 - samples/sec: 8793.24 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:42:58,539 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:58,539 EPOCH 9 done: loss 0.3876 - lr: 0.000003 |
|
2023-10-19 23:42:59,308 DEV : loss 0.3116590082645416 - f1-score (micro avg) 0.1842 |
|
2023-10-19 23:42:59,312 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:42:59,844 epoch 10 - iter 29/292 - loss 0.28754311 - time (sec): 0.53 - samples/sec: 8273.26 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:43:00,349 epoch 10 - iter 58/292 - loss 0.32926331 - time (sec): 1.04 - samples/sec: 8670.32 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:43:00,849 epoch 10 - iter 87/292 - loss 0.36191922 - time (sec): 1.54 - samples/sec: 8343.15 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:43:01,357 epoch 10 - iter 116/292 - loss 0.38321652 - time (sec): 2.04 - samples/sec: 8290.88 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:43:01,851 epoch 10 - iter 145/292 - loss 0.40004916 - time (sec): 2.54 - samples/sec: 8267.04 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:43:02,355 epoch 10 - iter 174/292 - loss 0.38579499 - time (sec): 3.04 - samples/sec: 8332.74 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:43:02,866 epoch 10 - iter 203/292 - loss 0.37548323 - time (sec): 3.55 - samples/sec: 8364.88 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:43:03,398 epoch 10 - iter 232/292 - loss 0.37611261 - time (sec): 4.08 - samples/sec: 8522.01 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:43:03,915 epoch 10 - iter 261/292 - loss 0.37076893 - time (sec): 4.60 - samples/sec: 8506.28 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 23:43:04,459 epoch 10 - iter 290/292 - loss 0.39272161 - time (sec): 5.15 - samples/sec: 8606.84 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 23:43:04,488 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:43:04,488 EPOCH 10 done: loss 0.3923 - lr: 0.000000 |
|
2023-10-19 23:43:05,128 DEV : loss 0.3108135163784027 - f1-score (micro avg) 0.1927 |
|
2023-10-19 23:43:05,132 saving best model |
|
2023-10-19 23:43:05,190 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:43:05,190 Loading model from best epoch ... |
|
2023-10-19 23:43:05,263 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-19 23:43:06,146 |
|
Results: |
|
- F-score (micro) 0.2063 |
|
- F-score (macro) 0.1109 |
|
- Accuracy 0.1194 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.2196    0.1868    0.2019       348 |
|
         LOC     0.2460    0.2375    0.2417       261 |
|
         ORG     0.0000    0.0000    0.0000        52 |
|
   HumanProd     0.0000    0.0000    0.0000        22 |
|
|
|
   micro avg     0.2318    0.1859    0.2063       683 |
|
   macro avg     0.1164    0.1061    0.1109       683 |
|
weighted avg     0.2059    0.1859    0.1952       683 |
|
|
|
2023-10-19 23:43:06,146 ---------------------------------------------------------------------------------------------------- |
|
|
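Note on the summary rows of the final report: they can be re-derived from the per-class rows. The sketch below recomputes the micro, macro, and weighted F-scores from the numbers printed in the log. The variable and function names are illustrative, and the inputs are the rounded four-decimal values from the log, so agreement is only expected to four decimals.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
report = {
    "PER": (0.2196, 0.1868, 0.2019, 348),
    "LOC": (0.2460, 0.2375, 0.2417, 261),
    "ORG": (0.0000, 0.0000, 0.0000, 52),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 follows from the micro-averaged precision and recall in the log.
micro_f1 = f1(0.2318, 0.1859)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted F1 weights each class F1 by its support (gold entity count).
total_support = sum(row[3] for row in report.values())
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

if __name__ == "__main__":
    print(f"micro F1    = {micro_f1:.4f}")
    print(f"macro F1    = {macro_f1:.4f}")
    print(f"weighted F1 = {weighted_f1:.4f}")
    print(f"support     = {total_support}")
```

The gap between micro (0.2063) and macro (0.1109) F-score comes from ORG and HumanProd, which together have 74 of the 683 gold entities and score 0.0 on every metric.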