2023-10-17 14:10:11,532 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,533 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 14:10:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,533 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 14:10:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,533 Train: 7142 sentences
2023-10-17 14:10:11,533 (train_with_dev=False, train_with_test=False)
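The split sizes above come from the French newseye subset of HIPE-2022, cached under ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator. A hedged sketch of loading it with Flair's built-in dataset class follows; the version and add_document_separator arguments are best-effort guesses from that cache path and may differ across Flair versions.

```python
# Sketch: loading the corpus whose splits are reported above.
# dataset_name/language follow the cache path; version and add_document_separator
# are assumptions based on the ".../v2.1/.../with_doc_seperator" directory.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(
    dataset_name="newseye",
    language="fr",
    version="v2.1",
    add_document_separator=True,
)
print(corpus)  # should report the split sizes logged above

label_dict = corpus.make_label_dictionary(label_type="ner")
```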
2023-10-17 14:10:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Training Params:
2023-10-17 14:10:11,534 - learning_rate: "3e-05"
2023-10-17 14:10:11,534 - mini_batch_size: "8"
2023-10-17 14:10:11,534 - max_epochs: "10"
2023-10-17 14:10:11,534 - shuffle: "True"
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Plugins:
2023-10-17 14:10:11,534 - TensorboardLogger
2023-10-17 14:10:11,534 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 14:10:11,534 - metric: "('micro avg', 'f1-score')"
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
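Taken together, the training parameters, plugins, and evaluation metric above map onto a fine_tune() call along the following lines. This is a sketch assuming a recent Flair version (the TensorboardLogger import path and the warmup_fraction argument in particular may vary), reusing tagger and corpus from the earlier sketches.

```python
# Sketch of a fine-tuning call consistent with the logged parameters.
# "tagger" and "corpus" are assumed to come from the sketches above.
from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger  # import path may differ by Flair version

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,  # surfaces as the "LinearScheduler" plugin above
    main_evaluation_metric=("micro avg", "f1-score"),
    plugins=[TensorboardLogger()],
)
```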
2023-10-17 14:10:11,534 Computation:
2023-10-17 14:10:11,534 - compute on device: cuda:0
2023-10-17 14:10:11,534 - embedding storage: none
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 14:10:18,101 epoch 1 - iter 89/893 - loss 3.18821826 - time (sec): 6.57 - samples/sec: 3663.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:10:24,900 epoch 1 - iter 178/893 - loss 2.02730331 - time (sec): 13.37 - samples/sec: 3655.29 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:10:31,279 epoch 1 - iter 267/893 - loss 1.54250315 - time (sec): 19.74 - samples/sec: 3620.84 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:10:38,176 epoch 1 - iter 356/893 - loss 1.22936425 - time (sec): 26.64 - samples/sec: 3648.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:10:45,399 epoch 1 - iter 445/893 - loss 1.03366668 - time (sec): 33.86 - samples/sec: 3618.90 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:10:52,409 epoch 1 - iter 534/893 - loss 0.90453755 - time (sec): 40.87 - samples/sec: 3593.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:10:59,662 epoch 1 - iter 623/893 - loss 0.79689601 - time (sec): 48.13 - samples/sec: 3580.57 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:11:06,903 epoch 1 - iter 712/893 - loss 0.71218918 - time (sec): 55.37 - samples/sec: 3595.84 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:11:13,804 epoch 1 - iter 801/893 - loss 0.65097764 - time (sec): 62.27 - samples/sec: 3605.59 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:11:20,259 epoch 1 - iter 890/893 - loss 0.60396316 - time (sec): 68.72 - samples/sec: 3607.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:11:20,494 ----------------------------------------------------------------------------------------------------
2023-10-17 14:11:20,494 EPOCH 1 done: loss 0.6024 - lr: 0.000030
2023-10-17 14:11:23,692 DEV : loss 0.10485294461250305 - f1-score (micro avg) 0.7213
2023-10-17 14:11:23,708 saving best model
2023-10-17 14:11:24,047 ----------------------------------------------------------------------------------------------------
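The lr column ramps up over epoch 1 and then decays linearly toward zero over the remaining epochs, which is what the LinearScheduler plugin with warmup_fraction 0.1 produces: 893 iterations per epoch times 10 epochs gives 8930 steps, of which the first 10% (all of epoch 1) are warmup. The small function below is illustrative only, not taken from Flair, but it reproduces the logged values.

```python
# Illustrative only: the piecewise-linear schedule implied by the lr column.
# 893 iterations/epoch * 10 epochs = 8930 steps; 10% warmup = 893 steps (epoch 1).
def linear_lr(step, peak_lr=3e-5, total_steps=8930, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps                              # warmup ramp
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay to 0

print(linear_lr(89))         # ~3e-06, as logged at epoch 1, iter 89/893
print(linear_lr(893 + 890))  # ~2.7e-05, as logged at epoch 2, iter 890/893
```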
2023-10-17 14:11:30,654 epoch 2 - iter 89/893 - loss 0.12053920 - time (sec): 6.61 - samples/sec: 3623.05 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:11:37,814 epoch 2 - iter 178/893 - loss 0.12089620 - time (sec): 13.77 - samples/sec: 3633.53 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:11:44,955 epoch 2 - iter 267/893 - loss 0.11493666 - time (sec): 20.91 - samples/sec: 3580.12 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:11:51,723 epoch 2 - iter 356/893 - loss 0.11530044 - time (sec): 27.67 - samples/sec: 3604.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:11:58,309 epoch 2 - iter 445/893 - loss 0.11276770 - time (sec): 34.26 - samples/sec: 3603.94 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:12:05,445 epoch 2 - iter 534/893 - loss 0.10981634 - time (sec): 41.40 - samples/sec: 3593.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:12:12,466 epoch 2 - iter 623/893 - loss 0.11043125 - time (sec): 48.42 - samples/sec: 3564.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:12:19,698 epoch 2 - iter 712/893 - loss 0.10651423 - time (sec): 55.65 - samples/sec: 3564.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:12:26,705 epoch 2 - iter 801/893 - loss 0.10556302 - time (sec): 62.66 - samples/sec: 3571.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:12:33,493 epoch 2 - iter 890/893 - loss 0.10507616 - time (sec): 69.44 - samples/sec: 3571.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:12:33,703 ----------------------------------------------------------------------------------------------------
2023-10-17 14:12:33,703 EPOCH 2 done: loss 0.1051 - lr: 0.000027
2023-10-17 14:12:37,950 DEV : loss 0.10654985904693604 - f1-score (micro avg) 0.7669
2023-10-17 14:12:37,967 saving best model
2023-10-17 14:12:38,428 ----------------------------------------------------------------------------------------------------
2023-10-17 14:12:45,119 epoch 3 - iter 89/893 - loss 0.06949100 - time (sec): 6.69 - samples/sec: 3371.75 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:12:51,956 epoch 3 - iter 178/893 - loss 0.06461281 - time (sec): 13.53 - samples/sec: 3555.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:12:58,796 epoch 3 - iter 267/893 - loss 0.06349975 - time (sec): 20.37 - samples/sec: 3594.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:13:05,526 epoch 3 - iter 356/893 - loss 0.06379984 - time (sec): 27.10 - samples/sec: 3608.99 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:13:12,774 epoch 3 - iter 445/893 - loss 0.06225707 - time (sec): 34.34 - samples/sec: 3558.57 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:13:19,621 epoch 3 - iter 534/893 - loss 0.06365270 - time (sec): 41.19 - samples/sec: 3538.11 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:13:26,744 epoch 3 - iter 623/893 - loss 0.06442220 - time (sec): 48.31 - samples/sec: 3563.22 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:13:33,715 epoch 3 - iter 712/893 - loss 0.06324340 - time (sec): 55.29 - samples/sec: 3579.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:13:40,729 epoch 3 - iter 801/893 - loss 0.06243522 - time (sec): 62.30 - samples/sec: 3596.74 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:13:48,068 epoch 3 - iter 890/893 - loss 0.06421146 - time (sec): 69.64 - samples/sec: 3560.36 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:13:48,284 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:48,285 EPOCH 3 done: loss 0.0643 - lr: 0.000023
2023-10-17 14:13:52,398 DEV : loss 0.11152768135070801 - f1-score (micro avg) 0.8067
2023-10-17 14:13:52,414 saving best model
2023-10-17 14:13:52,855 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:59,746 epoch 4 - iter 89/893 - loss 0.03606401 - time (sec): 6.89 - samples/sec: 3522.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:14:07,251 epoch 4 - iter 178/893 - loss 0.04211063 - time (sec): 14.39 - samples/sec: 3545.46 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:14:14,390 epoch 4 - iter 267/893 - loss 0.04187109 - time (sec): 21.53 - samples/sec: 3547.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:14:21,377 epoch 4 - iter 356/893 - loss 0.04455749 - time (sec): 28.52 - samples/sec: 3538.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:14:28,045 epoch 4 - iter 445/893 - loss 0.04511641 - time (sec): 35.19 - samples/sec: 3562.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:14:34,935 epoch 4 - iter 534/893 - loss 0.04363248 - time (sec): 42.08 - samples/sec: 3561.73 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:14:41,614 epoch 4 - iter 623/893 - loss 0.04457538 - time (sec): 48.75 - samples/sec: 3544.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:14:48,764 epoch 4 - iter 712/893 - loss 0.04527644 - time (sec): 55.90 - samples/sec: 3539.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:14:55,674 epoch 4 - iter 801/893 - loss 0.04599751 - time (sec): 62.81 - samples/sec: 3548.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:15:02,549 epoch 4 - iter 890/893 - loss 0.04579098 - time (sec): 69.69 - samples/sec: 3559.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:15:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 14:15:02,749 EPOCH 4 done: loss 0.0457 - lr: 0.000020
2023-10-17 14:15:07,472 DEV : loss 0.14700356125831604 - f1-score (micro avg) 0.7955
2023-10-17 14:15:07,490 ----------------------------------------------------------------------------------------------------
2023-10-17 14:15:14,548 epoch 5 - iter 89/893 - loss 0.02159756 - time (sec): 7.06 - samples/sec: 3494.49 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:15:21,637 epoch 5 - iter 178/893 - loss 0.02588404 - time (sec): 14.15 - samples/sec: 3591.19 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:15:28,996 epoch 5 - iter 267/893 - loss 0.03084829 - time (sec): 21.51 - samples/sec: 3568.58 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:15:35,859 epoch 5 - iter 356/893 - loss 0.03225333 - time (sec): 28.37 - samples/sec: 3579.62 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:15:42,592 epoch 5 - iter 445/893 - loss 0.03370496 - time (sec): 35.10 - samples/sec: 3573.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:15:49,750 epoch 5 - iter 534/893 - loss 0.03350192 - time (sec): 42.26 - samples/sec: 3586.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:15:56,643 epoch 5 - iter 623/893 - loss 0.03401656 - time (sec): 49.15 - samples/sec: 3578.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:16:03,722 epoch 5 - iter 712/893 - loss 0.03477601 - time (sec): 56.23 - samples/sec: 3560.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:16:10,505 epoch 5 - iter 801/893 - loss 0.03446752 - time (sec): 63.01 - samples/sec: 3563.60 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:16:17,059 epoch 5 - iter 890/893 - loss 0.03506799 - time (sec): 69.57 - samples/sec: 3566.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:16:17,273 ----------------------------------------------------------------------------------------------------
2023-10-17 14:16:17,273 EPOCH 5 done: loss 0.0350 - lr: 0.000017
2023-10-17 14:16:21,422 DEV : loss 0.17064201831817627 - f1-score (micro avg) 0.8171
2023-10-17 14:16:21,439 saving best model
2023-10-17 14:16:21,902 ----------------------------------------------------------------------------------------------------
2023-10-17 14:16:28,780 epoch 6 - iter 89/893 - loss 0.01857424 - time (sec): 6.88 - samples/sec: 3577.99 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:16:35,905 epoch 6 - iter 178/893 - loss 0.02937566 - time (sec): 14.00 - samples/sec: 3638.01 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:16:42,905 epoch 6 - iter 267/893 - loss 0.02608151 - time (sec): 21.00 - samples/sec: 3592.78 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:16:49,575 epoch 6 - iter 356/893 - loss 0.02657817 - time (sec): 27.67 - samples/sec: 3595.01 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:16:56,500 epoch 6 - iter 445/893 - loss 0.02885372 - time (sec): 34.60 - samples/sec: 3573.37 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:17:03,373 epoch 6 - iter 534/893 - loss 0.02838909 - time (sec): 41.47 - samples/sec: 3571.61 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:17:10,284 epoch 6 - iter 623/893 - loss 0.02811999 - time (sec): 48.38 - samples/sec: 3578.24 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:17:17,204 epoch 6 - iter 712/893 - loss 0.02761914 - time (sec): 55.30 - samples/sec: 3581.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:17:24,288 epoch 6 - iter 801/893 - loss 0.02780382 - time (sec): 62.38 - samples/sec: 3579.86 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:17:31,228 epoch 6 - iter 890/893 - loss 0.02839401 - time (sec): 69.32 - samples/sec: 3578.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:17:31,447 ----------------------------------------------------------------------------------------------------
2023-10-17 14:17:31,447 EPOCH 6 done: loss 0.0283 - lr: 0.000013
2023-10-17 14:17:36,227 DEV : loss 0.1936318576335907 - f1-score (micro avg) 0.809
2023-10-17 14:17:36,243 ----------------------------------------------------------------------------------------------------
2023-10-17 14:17:43,564 epoch 7 - iter 89/893 - loss 0.02232161 - time (sec): 7.32 - samples/sec: 3518.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:17:50,393 epoch 7 - iter 178/893 - loss 0.02338449 - time (sec): 14.15 - samples/sec: 3547.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:17:57,533 epoch 7 - iter 267/893 - loss 0.02202040 - time (sec): 21.29 - samples/sec: 3505.95 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:18:04,965 epoch 7 - iter 356/893 - loss 0.02401679 - time (sec): 28.72 - samples/sec: 3515.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:18:11,827 epoch 7 - iter 445/893 - loss 0.02427825 - time (sec): 35.58 - samples/sec: 3528.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:18:19,045 epoch 7 - iter 534/893 - loss 0.02405158 - time (sec): 42.80 - samples/sec: 3525.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:18:25,907 epoch 7 - iter 623/893 - loss 0.02394111 - time (sec): 49.66 - samples/sec: 3532.13 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:18:32,477 epoch 7 - iter 712/893 - loss 0.02434482 - time (sec): 56.23 - samples/sec: 3531.54 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:18:39,111 epoch 7 - iter 801/893 - loss 0.02382729 - time (sec): 62.87 - samples/sec: 3549.30 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:18:46,127 epoch 7 - iter 890/893 - loss 0.02377225 - time (sec): 69.88 - samples/sec: 3552.15 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:18:46,297 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:46,298 EPOCH 7 done: loss 0.0238 - lr: 0.000010
2023-10-17 14:18:51,143 DEV : loss 0.19045308232307434 - f1-score (micro avg) 0.8169
2023-10-17 14:18:51,161 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:58,365 epoch 8 - iter 89/893 - loss 0.01688687 - time (sec): 7.20 - samples/sec: 3308.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:19:05,754 epoch 8 - iter 178/893 - loss 0.01507813 - time (sec): 14.59 - samples/sec: 3414.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:19:12,379 epoch 8 - iter 267/893 - loss 0.01659736 - time (sec): 21.22 - samples/sec: 3411.65 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:19:19,722 epoch 8 - iter 356/893 - loss 0.01634020 - time (sec): 28.56 - samples/sec: 3417.32 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:19:26,948 epoch 8 - iter 445/893 - loss 0.01807096 - time (sec): 35.79 - samples/sec: 3447.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:19:33,565 epoch 8 - iter 534/893 - loss 0.01770225 - time (sec): 42.40 - samples/sec: 3499.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:19:40,128 epoch 8 - iter 623/893 - loss 0.01823384 - time (sec): 48.97 - samples/sec: 3525.85 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:19:46,969 epoch 8 - iter 712/893 - loss 0.01825900 - time (sec): 55.81 - samples/sec: 3519.18 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:19:53,671 epoch 8 - iter 801/893 - loss 0.01738740 - time (sec): 62.51 - samples/sec: 3527.35 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:20:01,160 epoch 8 - iter 890/893 - loss 0.01771823 - time (sec): 70.00 - samples/sec: 3541.43 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:20:01,398 ----------------------------------------------------------------------------------------------------
2023-10-17 14:20:01,398 EPOCH 8 done: loss 0.0177 - lr: 0.000007
2023-10-17 14:20:05,566 DEV : loss 0.192477285861969 - f1-score (micro avg) 0.8288
2023-10-17 14:20:05,583 saving best model
2023-10-17 14:20:06,038 ----------------------------------------------------------------------------------------------------
2023-10-17 14:20:13,064 epoch 9 - iter 89/893 - loss 0.01801821 - time (sec): 7.02 - samples/sec: 3548.33 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:20:20,095 epoch 9 - iter 178/893 - loss 0.01400252 - time (sec): 14.05 - samples/sec: 3497.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:20:27,466 epoch 9 - iter 267/893 - loss 0.01241543 - time (sec): 21.43 - samples/sec: 3498.11 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:20:34,094 epoch 9 - iter 356/893 - loss 0.01183902 - time (sec): 28.05 - samples/sec: 3533.40 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:20:40,711 epoch 9 - iter 445/893 - loss 0.01202647 - time (sec): 34.67 - samples/sec: 3546.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:20:47,463 epoch 9 - iter 534/893 - loss 0.01194551 - time (sec): 41.42 - samples/sec: 3584.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:20:54,584 epoch 9 - iter 623/893 - loss 0.01167993 - time (sec): 48.54 - samples/sec: 3589.57 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:21:01,573 epoch 9 - iter 712/893 - loss 0.01218580 - time (sec): 55.53 - samples/sec: 3591.70 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:21:08,429 epoch 9 - iter 801/893 - loss 0.01218316 - time (sec): 62.39 - samples/sec: 3573.44 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:21:15,572 epoch 9 - iter 890/893 - loss 0.01221895 - time (sec): 69.53 - samples/sec: 3566.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:21:15,802 ----------------------------------------------------------------------------------------------------
2023-10-17 14:21:15,802 EPOCH 9 done: loss 0.0122 - lr: 0.000003
2023-10-17 14:21:20,642 DEV : loss 0.20407408475875854 - f1-score (micro avg) 0.8209
2023-10-17 14:21:20,661 ----------------------------------------------------------------------------------------------------
2023-10-17 14:21:27,776 epoch 10 - iter 89/893 - loss 0.00971997 - time (sec): 7.11 - samples/sec: 3615.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:21:34,502 epoch 10 - iter 178/893 - loss 0.00955405 - time (sec): 13.84 - samples/sec: 3536.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:21:41,337 epoch 10 - iter 267/893 - loss 0.00944326 - time (sec): 20.68 - samples/sec: 3600.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:21:47,925 epoch 10 - iter 356/893 - loss 0.00930071 - time (sec): 27.26 - samples/sec: 3574.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:21:55,162 epoch 10 - iter 445/893 - loss 0.00925717 - time (sec): 34.50 - samples/sec: 3549.48 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:22:01,938 epoch 10 - iter 534/893 - loss 0.00902192 - time (sec): 41.28 - samples/sec: 3535.95 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:09,683 epoch 10 - iter 623/893 - loss 0.00924880 - time (sec): 49.02 - samples/sec: 3531.34 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:16,544 epoch 10 - iter 712/893 - loss 0.00949301 - time (sec): 55.88 - samples/sec: 3530.79 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:23,898 epoch 10 - iter 801/893 - loss 0.00974846 - time (sec): 63.24 - samples/sec: 3537.12 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:22:30,922 epoch 10 - iter 890/893 - loss 0.00937198 - time (sec): 70.26 - samples/sec: 3530.47 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:22:31,113 ----------------------------------------------------------------------------------------------------
2023-10-17 14:22:31,113 EPOCH 10 done: loss 0.0094 - lr: 0.000000
2023-10-17 14:22:35,808 DEV : loss 0.21412597596645355 - f1-score (micro avg) 0.8173
2023-10-17 14:22:36,161 ----------------------------------------------------------------------------------------------------
2023-10-17 14:22:36,162 Loading model from best epoch ...
2023-10-17 14:22:37,524 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
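The best checkpoint reloaded here can be applied to new text as sketched below; the checkpoint path is assembled from the training base path for illustration, and the example sentence is arbitrary.

```python
# Sketch: loading the best checkpoint and tagging a sentence.
# The path is inferred from the training base path; adjust to the actual location.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 4))
```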
2023-10-17 14:22:47,325
Results:
- F-score (micro) 0.7081
- F-score (macro) 0.6378
- Accuracy 0.5674
By class:
              precision    recall  f1-score   support

         LOC     0.7379    0.6941    0.7153      1095
         PER     0.8155    0.7777    0.7962      1012
         ORG     0.4339    0.6162    0.5093       357
   HumanProd     0.4000    0.7879    0.5306        33

   micro avg     0.6985    0.7181    0.7081      2497
   macro avg     0.5968    0.7190    0.6378      2497
weighted avg     0.7214    0.7181    0.7162      2497
2023-10-17 14:22:47,325 ----------------------------------------------------------------------------------------------------
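The per-class table above is the standard Flair test-set report. Something equivalent can be regenerated from the reloaded checkpoint as sketched here, with corpus and tagger as in the earlier sketches; argument names may vary slightly across Flair versions.

```python
# Sketch: recomputing the final test-set report with Flair's evaluate().
result = tagger.evaluate(
    corpus.test,
    gold_label_type="ner",
    mini_batch_size=8,
)
print(result.detailed_results)  # per-class precision/recall/F1 table as above
print(result.main_score)        # micro-avg F1 (0.7081 in this run)
```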