2023-10-17 15:09:20,533 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 Train: 7142 sentences
2023-10-17 15:09:20,535 (train_with_dev=False, train_with_test=False)
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 Training Params:
2023-10-17 15:09:20,535 - learning_rate: "3e-05"
2023-10-17 15:09:20,535 - mini_batch_size: "8"
2023-10-17 15:09:20,535 - max_epochs: "10"
2023-10-17 15:09:20,535 - shuffle: "True"
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Plugins:
2023-10-17 15:09:20,536 - TensorboardLogger
2023-10-17 15:09:20,536 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
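The logged learning rates are consistent with a linear schedule over the 8930 total steps (893 batches x 10 epochs): `warmup_fraction: 0.1` means the lr ramps up to its 3e-05 peak across the first 893 steps (exactly epoch 1) and then decays linearly to zero. A minimal sketch of that shape (an assumed reconstruction, not Flair's actual scheduler code):

```python
def linear_schedule_lr(step, total_steps=8930, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the warmup phase, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 893 steps = epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This matches the log: lr 0.000003 at epoch 1 / iter 89, the 3e-05 peak at the end of epoch 1, and 0.000000 by the end of epoch 10.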
2023-10-17 15:09:20,536 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:09:20,536 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Computation:
2023-10-17 15:09:20,536 - compute on device: cuda:0
2023-10-17 15:09:20,536 - embedding storage: none
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:09:27,546 epoch 1 - iter 89/893 - loss 3.17468277 - time (sec): 7.01 - samples/sec: 3610.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:09:34,586 epoch 1 - iter 178/893 - loss 2.10197739 - time (sec): 14.05 - samples/sec: 3591.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:09:41,428 epoch 1 - iter 267/893 - loss 1.58385290 - time (sec): 20.89 - samples/sec: 3551.93 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:09:48,238 epoch 1 - iter 356/893 - loss 1.29295235 - time (sec): 27.70 - samples/sec: 3536.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:09:54,809 epoch 1 - iter 445/893 - loss 1.10083788 - time (sec): 34.27 - samples/sec: 3532.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:10:01,485 epoch 1 - iter 534/893 - loss 0.95582211 - time (sec): 40.95 - samples/sec: 3558.63 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:10:09,028 epoch 1 - iter 623/893 - loss 0.84747117 - time (sec): 48.49 - samples/sec: 3514.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:10:16,180 epoch 1 - iter 712/893 - loss 0.75381137 - time (sec): 55.64 - samples/sec: 3538.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:10:23,433 epoch 1 - iter 801/893 - loss 0.68340637 - time (sec): 62.90 - samples/sec: 3547.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:10:30,490 epoch 1 - iter 890/893 - loss 0.63034171 - time (sec): 69.95 - samples/sec: 3548.03 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:10:30,661 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:30,661 EPOCH 1 done: loss 0.6293 - lr: 0.000030
2023-10-17 15:10:33,451 DEV : loss 0.11842236667871475 - f1-score (micro avg) 0.7256
2023-10-17 15:10:33,469 saving best model
2023-10-17 15:10:33,816 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:41,206 epoch 2 - iter 89/893 - loss 0.12232714 - time (sec): 7.39 - samples/sec: 3748.60 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:10:48,052 epoch 2 - iter 178/893 - loss 0.11680102 - time (sec): 14.23 - samples/sec: 3631.43 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:10:54,953 epoch 2 - iter 267/893 - loss 0.11328328 - time (sec): 21.14 - samples/sec: 3618.39 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:01,904 epoch 2 - iter 356/893 - loss 0.11241774 - time (sec): 28.09 - samples/sec: 3575.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:08,351 epoch 2 - iter 445/893 - loss 0.11060263 - time (sec): 34.53 - samples/sec: 3583.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:15,113 epoch 2 - iter 534/893 - loss 0.11047951 - time (sec): 41.30 - samples/sec: 3576.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:22,245 epoch 2 - iter 623/893 - loss 0.11086841 - time (sec): 48.43 - samples/sec: 3549.93 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:29,735 epoch 2 - iter 712/893 - loss 0.10835554 - time (sec): 55.92 - samples/sec: 3542.40 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:36,607 epoch 2 - iter 801/893 - loss 0.10647861 - time (sec): 62.79 - samples/sec: 3542.10 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:44,089 epoch 2 - iter 890/893 - loss 0.10520436 - time (sec): 70.27 - samples/sec: 3533.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:44,268 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:44,268 EPOCH 2 done: loss 0.1053 - lr: 0.000027
2023-10-17 15:11:49,317 DEV : loss 0.10727142542600632 - f1-score (micro avg) 0.7891
2023-10-17 15:11:49,336 saving best model
2023-10-17 15:11:49,791 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:56,578 epoch 3 - iter 89/893 - loss 0.07446046 - time (sec): 6.79 - samples/sec: 3584.31 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:03,419 epoch 3 - iter 178/893 - loss 0.07112677 - time (sec): 13.63 - samples/sec: 3654.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:11,006 epoch 3 - iter 267/893 - loss 0.06742536 - time (sec): 21.21 - samples/sec: 3644.25 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:17,869 epoch 3 - iter 356/893 - loss 0.06600880 - time (sec): 28.08 - samples/sec: 3630.84 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:24,959 epoch 3 - iter 445/893 - loss 0.06825375 - time (sec): 35.17 - samples/sec: 3645.98 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:31,281 epoch 3 - iter 534/893 - loss 0.06844461 - time (sec): 41.49 - samples/sec: 3627.52 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:37,870 epoch 3 - iter 623/893 - loss 0.06742613 - time (sec): 48.08 - samples/sec: 3612.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:45,061 epoch 3 - iter 712/893 - loss 0.06721754 - time (sec): 55.27 - samples/sec: 3605.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:52,659 epoch 3 - iter 801/893 - loss 0.06717275 - time (sec): 62.87 - samples/sec: 3579.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:59,217 epoch 3 - iter 890/893 - loss 0.06707278 - time (sec): 69.42 - samples/sec: 3571.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:12:59,452 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:59,452 EPOCH 3 done: loss 0.0670 - lr: 0.000023
2023-10-17 15:13:04,383 DEV : loss 0.1304038017988205 - f1-score (micro avg) 0.7963
2023-10-17 15:13:04,400 saving best model
2023-10-17 15:13:04,853 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:12,174 epoch 4 - iter 89/893 - loss 0.04868423 - time (sec): 7.32 - samples/sec: 3502.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:19,287 epoch 4 - iter 178/893 - loss 0.04241356 - time (sec): 14.43 - samples/sec: 3520.55 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:26,239 epoch 4 - iter 267/893 - loss 0.04403277 - time (sec): 21.38 - samples/sec: 3544.00 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:32,878 epoch 4 - iter 356/893 - loss 0.04443889 - time (sec): 28.02 - samples/sec: 3549.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:39,966 epoch 4 - iter 445/893 - loss 0.04684561 - time (sec): 35.11 - samples/sec: 3518.63 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:46,763 epoch 4 - iter 534/893 - loss 0.04726451 - time (sec): 41.91 - samples/sec: 3522.77 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:13:54,015 epoch 4 - iter 623/893 - loss 0.04633020 - time (sec): 49.16 - samples/sec: 3525.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:01,146 epoch 4 - iter 712/893 - loss 0.04752612 - time (sec): 56.29 - samples/sec: 3522.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:08,348 epoch 4 - iter 801/893 - loss 0.04742352 - time (sec): 63.49 - samples/sec: 3521.82 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:15,272 epoch 4 - iter 890/893 - loss 0.04712026 - time (sec): 70.42 - samples/sec: 3519.42 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:15,544 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:15,544 EPOCH 4 done: loss 0.0472 - lr: 0.000020
2023-10-17 15:14:19,791 DEV : loss 0.1480644792318344 - f1-score (micro avg) 0.822
2023-10-17 15:14:19,808 saving best model
2023-10-17 15:14:20,262 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:27,481 epoch 5 - iter 89/893 - loss 0.02697342 - time (sec): 7.21 - samples/sec: 3458.11 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:34,191 epoch 5 - iter 178/893 - loss 0.02931270 - time (sec): 13.92 - samples/sec: 3520.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:41,111 epoch 5 - iter 267/893 - loss 0.03360478 - time (sec): 20.84 - samples/sec: 3527.20 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:47,741 epoch 5 - iter 356/893 - loss 0.03435094 - time (sec): 27.48 - samples/sec: 3527.12 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:55,013 epoch 5 - iter 445/893 - loss 0.03355795 - time (sec): 34.75 - samples/sec: 3491.76 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:15:02,081 epoch 5 - iter 534/893 - loss 0.03507889 - time (sec): 41.81 - samples/sec: 3504.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:15:09,134 epoch 5 - iter 623/893 - loss 0.03463081 - time (sec): 48.87 - samples/sec: 3519.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:15:16,208 epoch 5 - iter 712/893 - loss 0.03474055 - time (sec): 55.94 - samples/sec: 3524.52 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:23,557 epoch 5 - iter 801/893 - loss 0.03527026 - time (sec): 63.29 - samples/sec: 3527.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:30,470 epoch 5 - iter 890/893 - loss 0.03498960 - time (sec): 70.20 - samples/sec: 3534.99 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:30,640 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:30,640 EPOCH 5 done: loss 0.0349 - lr: 0.000017
2023-10-17 15:15:35,393 DEV : loss 0.15961149334907532 - f1-score (micro avg) 0.8035
2023-10-17 15:15:35,410 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:42,408 epoch 6 - iter 89/893 - loss 0.02420361 - time (sec): 7.00 - samples/sec: 3548.22 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:48,903 epoch 6 - iter 178/893 - loss 0.02406819 - time (sec): 13.49 - samples/sec: 3543.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:56,114 epoch 6 - iter 267/893 - loss 0.02420778 - time (sec): 20.70 - samples/sec: 3530.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:16:03,630 epoch 6 - iter 356/893 - loss 0.02473679 - time (sec): 28.22 - samples/sec: 3491.53 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:16:10,515 epoch 6 - iter 445/893 - loss 0.02601161 - time (sec): 35.10 - samples/sec: 3511.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:16:17,643 epoch 6 - iter 534/893 - loss 0.02577712 - time (sec): 42.23 - samples/sec: 3540.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:16:24,783 epoch 6 - iter 623/893 - loss 0.02608412 - time (sec): 49.37 - samples/sec: 3539.28 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:31,686 epoch 6 - iter 712/893 - loss 0.02665696 - time (sec): 56.27 - samples/sec: 3545.51 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:38,520 epoch 6 - iter 801/893 - loss 0.02745268 - time (sec): 63.11 - samples/sec: 3553.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:45,563 epoch 6 - iter 890/893 - loss 0.02802573 - time (sec): 70.15 - samples/sec: 3535.40 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:16:45,762 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:45,762 EPOCH 6 done: loss 0.0280 - lr: 0.000013
2023-10-17 15:16:50,006 DEV : loss 0.17348243296146393 - f1-score (micro avg) 0.8118
2023-10-17 15:16:50,025 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:57,708 epoch 7 - iter 89/893 - loss 0.01778190 - time (sec): 7.68 - samples/sec: 3402.72 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:17:04,684 epoch 7 - iter 178/893 - loss 0.01956122 - time (sec): 14.66 - samples/sec: 3453.95 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:17:11,587 epoch 7 - iter 267/893 - loss 0.01856255 - time (sec): 21.56 - samples/sec: 3439.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:17:18,586 epoch 7 - iter 356/893 - loss 0.01988639 - time (sec): 28.56 - samples/sec: 3483.76 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:17:25,653 epoch 7 - iter 445/893 - loss 0.01938792 - time (sec): 35.63 - samples/sec: 3491.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:17:32,453 epoch 7 - iter 534/893 - loss 0.02095563 - time (sec): 42.43 - samples/sec: 3513.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:17:39,259 epoch 7 - iter 623/893 - loss 0.02165179 - time (sec): 49.23 - samples/sec: 3520.41 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:17:46,264 epoch 7 - iter 712/893 - loss 0.02144328 - time (sec): 56.24 - samples/sec: 3506.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:17:53,665 epoch 7 - iter 801/893 - loss 0.02112030 - time (sec): 63.64 - samples/sec: 3505.50 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:18:00,568 epoch 7 - iter 890/893 - loss 0.02096110 - time (sec): 70.54 - samples/sec: 3519.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:18:00,782 ----------------------------------------------------------------------------------------------------
2023-10-17 15:18:00,782 EPOCH 7 done: loss 0.0211 - lr: 0.000010
2023-10-17 15:18:05,006 DEV : loss 0.19632981717586517 - f1-score (micro avg) 0.8309
2023-10-17 15:18:05,022 saving best model
2023-10-17 15:18:05,530 ----------------------------------------------------------------------------------------------------
2023-10-17 15:18:12,423 epoch 8 - iter 89/893 - loss 0.01429665 - time (sec): 6.89 - samples/sec: 3489.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:18:19,109 epoch 8 - iter 178/893 - loss 0.01707880 - time (sec): 13.58 - samples/sec: 3533.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:18:26,443 epoch 8 - iter 267/893 - loss 0.01664126 - time (sec): 20.91 - samples/sec: 3503.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:18:33,523 epoch 8 - iter 356/893 - loss 0.01557798 - time (sec): 27.99 - samples/sec: 3547.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:18:41,081 epoch 8 - iter 445/893 - loss 0.01559462 - time (sec): 35.55 - samples/sec: 3562.24 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:18:48,096 epoch 8 - iter 534/893 - loss 0.01483953 - time (sec): 42.56 - samples/sec: 3576.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:18:55,230 epoch 8 - iter 623/893 - loss 0.01549873 - time (sec): 49.70 - samples/sec: 3559.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:19:02,083 epoch 8 - iter 712/893 - loss 0.01562235 - time (sec): 56.55 - samples/sec: 3547.03 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:19:08,926 epoch 8 - iter 801/893 - loss 0.01515266 - time (sec): 63.39 - samples/sec: 3543.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:19:15,547 epoch 8 - iter 890/893 - loss 0.01524696 - time (sec): 70.01 - samples/sec: 3538.74 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:19:15,834 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:15,834 EPOCH 8 done: loss 0.0152 - lr: 0.000007
2023-10-17 15:19:21,293 DEV : loss 0.20254144072532654 - f1-score (micro avg) 0.8268
2023-10-17 15:19:21,322 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:28,171 epoch 9 - iter 89/893 - loss 0.01120597 - time (sec): 6.85 - samples/sec: 3518.61 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:19:34,750 epoch 9 - iter 178/893 - loss 0.01273785 - time (sec): 13.43 - samples/sec: 3581.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:19:41,616 epoch 9 - iter 267/893 - loss 0.01274289 - time (sec): 20.29 - samples/sec: 3581.05 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:19:48,297 epoch 9 - iter 356/893 - loss 0.01222354 - time (sec): 26.97 - samples/sec: 3590.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:19:55,627 epoch 9 - iter 445/893 - loss 0.01162771 - time (sec): 34.30 - samples/sec: 3585.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:20:02,456 epoch 9 - iter 534/893 - loss 0.01219240 - time (sec): 41.13 - samples/sec: 3617.29 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:20:09,469 epoch 9 - iter 623/893 - loss 0.01234197 - time (sec): 48.15 - samples/sec: 3586.34 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:20:16,331 epoch 9 - iter 712/893 - loss 0.01245901 - time (sec): 55.01 - samples/sec: 3593.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:20:23,343 epoch 9 - iter 801/893 - loss 0.01235256 - time (sec): 62.02 - samples/sec: 3585.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:20:31,178 epoch 9 - iter 890/893 - loss 0.01174338 - time (sec): 69.85 - samples/sec: 3547.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:31,426 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:31,426 EPOCH 9 done: loss 0.0117 - lr: 0.000003
2023-10-17 15:20:35,866 DEV : loss 0.21009869873523712 - f1-score (micro avg) 0.8231
2023-10-17 15:20:35,892 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:44,383 epoch 10 - iter 89/893 - loss 0.01168285 - time (sec): 8.49 - samples/sec: 2852.25 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:51,070 epoch 10 - iter 178/893 - loss 0.00956698 - time (sec): 15.18 - samples/sec: 3187.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:58,430 epoch 10 - iter 267/893 - loss 0.00810400 - time (sec): 22.54 - samples/sec: 3286.46 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:21:05,344 epoch 10 - iter 356/893 - loss 0.00938662 - time (sec): 29.45 - samples/sec: 3290.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:21:12,268 epoch 10 - iter 445/893 - loss 0.00895064 - time (sec): 36.37 - samples/sec: 3304.60 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:21:19,480 epoch 10 - iter 534/893 - loss 0.00957070 - time (sec): 43.58 - samples/sec: 3335.18 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:21:26,763 epoch 10 - iter 623/893 - loss 0.00910774 - time (sec): 50.87 - samples/sec: 3355.39 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:21:33,995 epoch 10 - iter 712/893 - loss 0.00887038 - time (sec): 58.10 - samples/sec: 3359.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:21:41,182 epoch 10 - iter 801/893 - loss 0.00857775 - time (sec): 65.29 - samples/sec: 3378.07 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:21:48,658 epoch 10 - iter 890/893 - loss 0.00850610 - time (sec): 72.76 - samples/sec: 3408.45 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:21:48,903 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:48,903 EPOCH 10 done: loss 0.0085 - lr: 0.000000
2023-10-17 15:21:53,236 DEV : loss 0.207401305437088 - f1-score (micro avg) 0.8282
2023-10-17 15:21:53,620 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:53,622 Loading model from best epoch ...
2023-10-17 15:21:54,979 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
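The 17 tags follow the BIOES scheme over four entity types (PER, LOC, ORG, HumanProd): S marks a single-token entity, B/I/E mark the begin/inside/end of a multi-token one, and O is outside any entity. A minimal decoder from BIOES tags to entity spans (a hypothetical helper for illustration, not part of Flair):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, label) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((i, i + 1, label))
        elif prefix == "B":                     # entity opens here
            start = i
        elif prefix == "E" and start is not None:  # entity closes here
            spans.append((start, i + 1, label))
            start = None
    return spans
```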
2023-10-17 15:22:05,756
Results:
- F-score (micro) 0.7185
- F-score (macro) 0.6326
- Accuracy 0.5744
By class:
              precision    recall  f1-score   support

         LOC     0.7239    0.7397    0.7317      1095
         PER     0.7950    0.7816    0.7882      1012
         ORG     0.4939    0.5630    0.5262       357
   HumanProd     0.3710    0.6970    0.4842        33

   micro avg     0.7065    0.7309    0.7185      2497
   macro avg     0.5959    0.6953    0.6326      2497
weighted avg     0.7151    0.7309    0.7220      2497
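The averages in the table can be cross-checked from the per-class rows: macro-F1 is the unweighted mean of the class F1 scores, weighted-F1 weights each class by its support, and micro-F1 is the harmonic mean of the micro-averaged precision and recall (my own arithmetic on the numbers above, not part of the original log):

```python
# Per-class (precision, recall, f1, support) copied from the table above.
classes = {
    "LOC":       (0.7239, 0.7397, 0.7317, 1095),
    "PER":       (0.7950, 0.7816, 0.7882, 1012),
    "ORG":       (0.4939, 0.5630, 0.5262, 357),
    "HumanProd": (0.3710, 0.6970, 0.4842, 33),
}
total = sum(s for *_, s in classes.values())                       # 2497 entities
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)
weighted_f1 = sum(f1 * s for _, _, f1, s in classes.values()) / total
# Micro F1 from the micro-averaged precision and recall in the table.
p, r = 0.7065, 0.7309
micro_f1 = 2 * p * r / (p + r)
```

The small discrepancies (e.g. weighted-F1 recomputing to ~0.7219 vs. the reported 0.7220) come from the per-class values being rounded to four decimals.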
2023-10-17 15:22:05,756 ----------------------------------------------------------------------------------------------------