2023-10-17 14:10:11,532 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,533 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 14:10:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,533 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 14:10:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,533 Train: 7142 sentences
2023-10-17 14:10:11,533 (train_with_dev=False, train_with_test=False)
2023-10-17 14:10:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Training Params:
2023-10-17 14:10:11,534  - learning_rate: "3e-05"
2023-10-17 14:10:11,534  - mini_batch_size: "8"
2023-10-17 14:10:11,534  - max_epochs: "10"
2023-10-17 14:10:11,534  - shuffle: "True"
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Plugins:
2023-10-17 14:10:11,534  - TensorboardLogger
2023-10-17 14:10:11,534  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 14:10:11,534  - metric: "('micro avg', 'f1-score')"
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Computation:
2023-10-17 14:10:11,534  - compute on device: cuda:0
2023-10-17 14:10:11,534  - embedding storage: none
2023-10-17 14:10:11,534 ----------------------------------------------------------------------------------------------------
2023-10-17 14:10:11,534 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
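Note: the configuration logged above corresponds roughly to the Flair fine-tuning script sketched below. This is a minimal, illustrative sketch, not the exact training code of this run: the corpus loader arguments, the hmteams/teams-base-historic-multilingual-discriminator checkpoint name, and the hidden_size value are inferred from the log and the base path and may need adjusting.

    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # HIPE-2022 "newseye" French split (7142 train / 698 dev / 2570 test sentences)
    corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # ELECTRA-style encoder, last layer only, first-subtoken pooling, fine-tuned
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # plain linear tag head (no CRF, no RNN), matching the printed architecture
    tagger = SequenceTagger(
        hidden_size=256,  # unused when use_rnn=False; value assumed
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    # fine_tune() uses AdamW with a linear schedule; the log shows warmup_fraction 0.1
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
        learning_rate=3e-5,
        mini_batch_size=8,
        max_epochs=10,
    )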
"hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-17 14:10:11,534 ---------------------------------------------------------------------------------------------------- 2023-10-17 14:10:11,534 ---------------------------------------------------------------------------------------------------- 2023-10-17 14:10:11,534 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 14:10:18,101 epoch 1 - iter 89/893 - loss 3.18821826 - time (sec): 6.57 - samples/sec: 3663.50 - lr: 0.000003 - momentum: 0.000000 2023-10-17 14:10:24,900 epoch 1 - iter 178/893 - loss 2.02730331 - time (sec): 13.37 - samples/sec: 3655.29 - lr: 0.000006 - momentum: 0.000000 2023-10-17 14:10:31,279 epoch 1 - iter 267/893 - loss 1.54250315 - time (sec): 19.74 - samples/sec: 3620.84 - lr: 0.000009 - momentum: 0.000000 2023-10-17 14:10:38,176 epoch 1 - iter 356/893 - loss 1.22936425 - time (sec): 26.64 - samples/sec: 3648.31 - lr: 0.000012 - momentum: 0.000000 2023-10-17 14:10:45,399 epoch 1 - iter 445/893 - loss 1.03366668 - time (sec): 33.86 - samples/sec: 3618.90 - lr: 0.000015 - momentum: 0.000000 2023-10-17 14:10:52,409 epoch 1 - iter 534/893 - loss 0.90453755 - time (sec): 40.87 - samples/sec: 3593.88 - lr: 0.000018 - momentum: 0.000000 2023-10-17 14:10:59,662 epoch 1 - iter 623/893 - loss 0.79689601 - time (sec): 48.13 - samples/sec: 3580.57 - lr: 0.000021 - momentum: 0.000000 2023-10-17 14:11:06,903 epoch 1 - iter 712/893 - loss 0.71218918 - time (sec): 55.37 - samples/sec: 3595.84 - lr: 0.000024 - momentum: 0.000000 2023-10-17 14:11:13,804 epoch 1 - iter 801/893 - loss 0.65097764 - time (sec): 62.27 - samples/sec: 3605.59 - lr: 0.000027 - momentum: 0.000000 2023-10-17 14:11:20,259 epoch 1 - iter 890/893 - loss 0.60396316 - time (sec): 68.72 - samples/sec: 3607.87 - lr: 0.000030 - momentum: 0.000000 2023-10-17 14:11:20,494 ---------------------------------------------------------------------------------------------------- 2023-10-17 14:11:20,494 EPOCH 1 done: loss 0.6024 - lr: 0.000030 2023-10-17 14:11:23,692 DEV : loss 0.10485294461250305 - f1-score (micro avg) 0.7213 2023-10-17 14:11:23,708 saving best model 2023-10-17 14:11:24,047 ---------------------------------------------------------------------------------------------------- 2023-10-17 14:11:30,654 epoch 2 - iter 89/893 - loss 0.12053920 - time (sec): 6.61 - samples/sec: 3623.05 - lr: 0.000030 - momentum: 0.000000 2023-10-17 14:11:37,814 epoch 2 - iter 178/893 - loss 0.12089620 - time (sec): 13.77 - samples/sec: 3633.53 - lr: 0.000029 - momentum: 0.000000 2023-10-17 14:11:44,955 epoch 2 - iter 267/893 - loss 0.11493666 - time (sec): 20.91 - samples/sec: 3580.12 - lr: 0.000029 - momentum: 0.000000 2023-10-17 14:11:51,723 epoch 2 - iter 356/893 - loss 0.11530044 - time (sec): 27.67 - samples/sec: 3604.47 - lr: 0.000029 - momentum: 0.000000 2023-10-17 14:11:58,309 epoch 2 - iter 445/893 - loss 0.11276770 - time (sec): 34.26 - samples/sec: 3603.94 - lr: 0.000028 - momentum: 0.000000 2023-10-17 14:12:05,445 epoch 2 - iter 534/893 - loss 0.10981634 - time (sec): 41.40 - samples/sec: 3593.09 - lr: 0.000028 - momentum: 0.000000 2023-10-17 14:12:12,466 epoch 2 - iter 623/893 - loss 0.11043125 - time (sec): 48.42 - samples/sec: 3564.81 - lr: 0.000028 - momentum: 0.000000 2023-10-17 14:12:19,698 epoch 2 - iter 712/893 - loss 0.10651423 - time (sec): 55.65 - samples/sec: 3564.13 - lr: 0.000027 - momentum: 0.000000 2023-10-17 14:12:26,705 epoch 2 - iter 
2023-10-17 14:12:33,493 epoch 2 - iter 890/893 - loss 0.10507616 - time (sec): 69.44 - samples/sec: 3571.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:12:33,703 ----------------------------------------------------------------------------------------------------
2023-10-17 14:12:33,703 EPOCH 2 done: loss 0.1051 - lr: 0.000027
2023-10-17 14:12:37,950 DEV : loss 0.10654985904693604 - f1-score (micro avg) 0.7669
2023-10-17 14:12:37,967 saving best model
2023-10-17 14:12:38,428 ----------------------------------------------------------------------------------------------------
2023-10-17 14:12:45,119 epoch 3 - iter 89/893 - loss 0.06949100 - time (sec): 6.69 - samples/sec: 3371.75 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:12:51,956 epoch 3 - iter 178/893 - loss 0.06461281 - time (sec): 13.53 - samples/sec: 3555.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:12:58,796 epoch 3 - iter 267/893 - loss 0.06349975 - time (sec): 20.37 - samples/sec: 3594.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:13:05,526 epoch 3 - iter 356/893 - loss 0.06379984 - time (sec): 27.10 - samples/sec: 3608.99 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:13:12,774 epoch 3 - iter 445/893 - loss 0.06225707 - time (sec): 34.34 - samples/sec: 3558.57 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:13:19,621 epoch 3 - iter 534/893 - loss 0.06365270 - time (sec): 41.19 - samples/sec: 3538.11 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:13:26,744 epoch 3 - iter 623/893 - loss 0.06442220 - time (sec): 48.31 - samples/sec: 3563.22 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:13:33,715 epoch 3 - iter 712/893 - loss 0.06324340 - time (sec): 55.29 - samples/sec: 3579.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:13:40,729 epoch 3 - iter 801/893 - loss 0.06243522 - time (sec): 62.30 - samples/sec: 3596.74 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:13:48,068 epoch 3 - iter 890/893 - loss 0.06421146 - time (sec): 69.64 - samples/sec: 3560.36 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:13:48,284 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:48,285 EPOCH 3 done: loss 0.0643 - lr: 0.000023
2023-10-17 14:13:52,398 DEV : loss 0.11152768135070801 - f1-score (micro avg) 0.8067
2023-10-17 14:13:52,414 saving best model
2023-10-17 14:13:52,855 ----------------------------------------------------------------------------------------------------
2023-10-17 14:13:59,746 epoch 4 - iter 89/893 - loss 0.03606401 - time (sec): 6.89 - samples/sec: 3522.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:14:07,251 epoch 4 - iter 178/893 - loss 0.04211063 - time (sec): 14.39 - samples/sec: 3545.46 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:14:14,390 epoch 4 - iter 267/893 - loss 0.04187109 - time (sec): 21.53 - samples/sec: 3547.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:14:21,377 epoch 4 - iter 356/893 - loss 0.04455749 - time (sec): 28.52 - samples/sec: 3538.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:14:28,045 epoch 4 - iter 445/893 - loss 0.04511641 - time (sec): 35.19 - samples/sec: 3562.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:14:34,935 epoch 4 - iter 534/893 - loss 0.04363248 - time (sec): 42.08 - samples/sec: 3561.73 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:14:41,614 epoch 4 - iter 623/893 - loss 0.04457538 - time (sec): 48.75 - samples/sec: 3544.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:14:48,764 epoch 4 - iter 712/893 - loss 0.04527644 - time (sec): 55.90 - samples/sec: 3539.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:14:55,674 epoch 4 - iter 801/893 - loss 0.04599751 - time (sec): 62.81 - samples/sec: 3548.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:15:02,549 epoch 4 - iter 890/893 - loss 0.04579098 - time (sec): 69.69 - samples/sec: 3559.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:15:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 14:15:02,749 EPOCH 4 done: loss 0.0457 - lr: 0.000020
2023-10-17 14:15:07,472 DEV : loss 0.14700356125831604 - f1-score (micro avg) 0.7955
2023-10-17 14:15:07,490 ----------------------------------------------------------------------------------------------------
2023-10-17 14:15:14,548 epoch 5 - iter 89/893 - loss 0.02159756 - time (sec): 7.06 - samples/sec: 3494.49 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:15:21,637 epoch 5 - iter 178/893 - loss 0.02588404 - time (sec): 14.15 - samples/sec: 3591.19 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:15:28,996 epoch 5 - iter 267/893 - loss 0.03084829 - time (sec): 21.51 - samples/sec: 3568.58 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:15:35,859 epoch 5 - iter 356/893 - loss 0.03225333 - time (sec): 28.37 - samples/sec: 3579.62 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:15:42,592 epoch 5 - iter 445/893 - loss 0.03370496 - time (sec): 35.10 - samples/sec: 3573.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:15:49,750 epoch 5 - iter 534/893 - loss 0.03350192 - time (sec): 42.26 - samples/sec: 3586.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:15:56,643 epoch 5 - iter 623/893 - loss 0.03401656 - time (sec): 49.15 - samples/sec: 3578.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:16:03,722 epoch 5 - iter 712/893 - loss 0.03477601 - time (sec): 56.23 - samples/sec: 3560.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:16:10,505 epoch 5 - iter 801/893 - loss 0.03446752 - time (sec): 63.01 - samples/sec: 3563.60 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:16:17,059 epoch 5 - iter 890/893 - loss 0.03506799 - time (sec): 69.57 - samples/sec: 3566.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:16:17,273 ----------------------------------------------------------------------------------------------------
2023-10-17 14:16:17,273 EPOCH 5 done: loss 0.0350 - lr: 0.000017
2023-10-17 14:16:21,422 DEV : loss 0.17064201831817627 - f1-score (micro avg) 0.8171
2023-10-17 14:16:21,439 saving best model
2023-10-17 14:16:21,902 ----------------------------------------------------------------------------------------------------
2023-10-17 14:16:28,780 epoch 6 - iter 89/893 - loss 0.01857424 - time (sec): 6.88 - samples/sec: 3577.99 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:16:35,905 epoch 6 - iter 178/893 - loss 0.02937566 - time (sec): 14.00 - samples/sec: 3638.01 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:16:42,905 epoch 6 - iter 267/893 - loss 0.02608151 - time (sec): 21.00 - samples/sec: 3592.78 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:16:49,575 epoch 6 - iter 356/893 - loss 0.02657817 - time (sec): 27.67 - samples/sec: 3595.01 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:16:56,500 epoch 6 - iter 445/893 - loss 0.02885372 - time (sec): 34.60 - samples/sec: 3573.37 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:17:03,373 epoch 6 - iter 534/893 - loss 0.02838909 - time (sec): 41.47 - samples/sec: 3571.61 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:17:10,284 epoch 6 - iter 623/893 - loss 0.02811999 - time (sec): 48.38 - samples/sec: 3578.24 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:17:17,204 epoch 6 - iter 712/893 - loss 0.02761914 - time (sec): 55.30 - samples/sec: 3581.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:17:24,288 epoch 6 - iter 801/893 - loss 0.02780382 - time (sec): 62.38 - samples/sec: 3579.86 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:17:31,228 epoch 6 - iter 890/893 - loss 0.02839401 - time (sec): 69.32 - samples/sec: 3578.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:17:31,447 ----------------------------------------------------------------------------------------------------
2023-10-17 14:17:31,447 EPOCH 6 done: loss 0.0283 - lr: 0.000013
2023-10-17 14:17:36,227 DEV : loss 0.1936318576335907 - f1-score (micro avg) 0.809
2023-10-17 14:17:36,243 ----------------------------------------------------------------------------------------------------
2023-10-17 14:17:43,564 epoch 7 - iter 89/893 - loss 0.02232161 - time (sec): 7.32 - samples/sec: 3518.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:17:50,393 epoch 7 - iter 178/893 - loss 0.02338449 - time (sec): 14.15 - samples/sec: 3547.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:17:57,533 epoch 7 - iter 267/893 - loss 0.02202040 - time (sec): 21.29 - samples/sec: 3505.95 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:18:04,965 epoch 7 - iter 356/893 - loss 0.02401679 - time (sec): 28.72 - samples/sec: 3515.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:18:11,827 epoch 7 - iter 445/893 - loss 0.02427825 - time (sec): 35.58 - samples/sec: 3528.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:18:19,045 epoch 7 - iter 534/893 - loss 0.02405158 - time (sec): 42.80 - samples/sec: 3525.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:18:25,907 epoch 7 - iter 623/893 - loss 0.02394111 - time (sec): 49.66 - samples/sec: 3532.13 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:18:32,477 epoch 7 - iter 712/893 - loss 0.02434482 - time (sec): 56.23 - samples/sec: 3531.54 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:18:39,111 epoch 7 - iter 801/893 - loss 0.02382729 - time (sec): 62.87 - samples/sec: 3549.30 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:18:46,127 epoch 7 - iter 890/893 - loss 0.02377225 - time (sec): 69.88 - samples/sec: 3552.15 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:18:46,297 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:46,298 EPOCH 7 done: loss 0.0238 - lr: 0.000010
2023-10-17 14:18:51,143 DEV : loss 0.19045308232307434 - f1-score (micro avg) 0.8169
2023-10-17 14:18:51,161 ----------------------------------------------------------------------------------------------------
2023-10-17 14:18:58,365 epoch 8 - iter 89/893 - loss 0.01688687 - time (sec): 7.20 - samples/sec: 3308.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:19:05,754 epoch 8 - iter 178/893 - loss 0.01507813 - time (sec): 14.59 - samples/sec: 3414.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:19:12,379 epoch 8 - iter 267/893 - loss 0.01659736 - time (sec): 21.22 - samples/sec: 3411.65 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:19:19,722 epoch 8 - iter 356/893 - loss 0.01634020 - time (sec): 28.56 - samples/sec: 3417.32 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:19:26,948 epoch 8 - iter 445/893 - loss 0.01807096 - time (sec): 35.79 - samples/sec: 3447.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:19:33,565 epoch 8 - iter 534/893 - loss 0.01770225 - time (sec): 42.40 - samples/sec: 3499.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:19:40,128 epoch 8 - iter 623/893 - loss 0.01823384 - time (sec): 48.97 - samples/sec: 3525.85 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:19:46,969 epoch 8 - iter 712/893 - loss 0.01825900 - time (sec): 55.81 - samples/sec: 3519.18 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:19:53,671 epoch 8 - iter 801/893 - loss 0.01738740 - time (sec): 62.51 - samples/sec: 3527.35 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:20:01,160 epoch 8 - iter 890/893 - loss 0.01771823 - time (sec): 70.00 - samples/sec: 3541.43 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:20:01,398 ----------------------------------------------------------------------------------------------------
2023-10-17 14:20:01,398 EPOCH 8 done: loss 0.0177 - lr: 0.000007
2023-10-17 14:20:05,566 DEV : loss 0.192477285861969 - f1-score (micro avg) 0.8288
2023-10-17 14:20:05,583 saving best model
2023-10-17 14:20:06,038 ----------------------------------------------------------------------------------------------------
2023-10-17 14:20:13,064 epoch 9 - iter 89/893 - loss 0.01801821 - time (sec): 7.02 - samples/sec: 3548.33 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:20:20,095 epoch 9 - iter 178/893 - loss 0.01400252 - time (sec): 14.05 - samples/sec: 3497.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:20:27,466 epoch 9 - iter 267/893 - loss 0.01241543 - time (sec): 21.43 - samples/sec: 3498.11 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:20:34,094 epoch 9 - iter 356/893 - loss 0.01183902 - time (sec): 28.05 - samples/sec: 3533.40 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:20:40,711 epoch 9 - iter 445/893 - loss 0.01202647 - time (sec): 34.67 - samples/sec: 3546.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:20:47,463 epoch 9 - iter 534/893 - loss 0.01194551 - time (sec): 41.42 - samples/sec: 3584.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:20:54,584 epoch 9 - iter 623/893 - loss 0.01167993 - time (sec): 48.54 - samples/sec: 3589.57 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:21:01,573 epoch 9 - iter 712/893 - loss 0.01218580 - time (sec): 55.53 - samples/sec: 3591.70 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:21:08,429 epoch 9 - iter 801/893 - loss 0.01218316 - time (sec): 62.39 - samples/sec: 3573.44 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:21:15,572 epoch 9 - iter 890/893 - loss 0.01221895 - time (sec): 69.53 - samples/sec: 3566.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:21:15,802 ----------------------------------------------------------------------------------------------------
2023-10-17 14:21:15,802 EPOCH 9 done: loss 0.0122 - lr: 0.000003
2023-10-17 14:21:20,642 DEV : loss 0.20407408475875854 - f1-score (micro avg) 0.8209
2023-10-17 14:21:20,661 ----------------------------------------------------------------------------------------------------
2023-10-17 14:21:27,776 epoch 10 - iter 89/893 - loss 0.00971997 - time (sec): 7.11 - samples/sec: 3615.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:21:34,502 epoch 10 - iter 178/893 - loss 0.00955405 - time (sec): 13.84 - samples/sec: 3536.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:21:41,337 epoch 10 - iter 267/893 - loss 0.00944326 - time (sec): 20.68 - samples/sec: 3600.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:21:47,925 epoch 10 - iter 356/893 - loss 0.00930071 - time (sec): 27.26 - samples/sec: 3574.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:21:55,162 epoch 10 - iter 445/893 - loss 0.00925717 - time (sec): 34.50 - samples/sec: 3549.48 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:22:01,938 epoch 10 - iter 534/893 - loss 0.00902192 - time (sec): 41.28 - samples/sec: 3535.95 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:09,683 epoch 10 - iter 623/893 - loss 0.00924880 - time (sec): 49.02 - samples/sec: 3531.34 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:16,544 epoch 10 - iter 712/893 - loss 0.00949301 - time (sec): 55.88 - samples/sec: 3530.79 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:22:23,898 epoch 10 - iter 801/893 - loss 0.00974846 - time (sec): 63.24 - samples/sec: 3537.12 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:22:30,922 epoch 10 - iter 890/893 - loss 0.00937198 - time (sec): 70.26 - samples/sec: 3530.47 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:22:31,113 ----------------------------------------------------------------------------------------------------
2023-10-17 14:22:31,113 EPOCH 10 done: loss 0.0094 - lr: 0.000000
2023-10-17 14:22:35,808 DEV : loss 0.21412597596645355 - f1-score (micro avg) 0.8173
2023-10-17 14:22:36,161 ----------------------------------------------------------------------------------------------------
2023-10-17 14:22:36,162 Loading model from best epoch ...
2023-10-17 14:22:37,524 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 14:22:47,325 Results:
- F-score (micro) 0.7081
- F-score (macro) 0.6378
- Accuracy 0.5674

By class:
              precision    recall  f1-score   support

         LOC     0.7379    0.6941    0.7153      1095
         PER     0.8155    0.7777    0.7962      1012
         ORG     0.4339    0.6162    0.5093       357
   HumanProd     0.4000    0.7879    0.5306        33

   micro avg     0.6985    0.7181    0.7081      2497
   macro avg     0.5968    0.7190    0.6378      2497
weighted avg     0.7214    0.7181    0.7162      2497

2023-10-17 14:22:47,325 ----------------------------------------------------------------------------------------------------
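Note: the best checkpoint saved by this run (best-model.pt under the training base path) can be reused for inference roughly as sketched below. The checkpoint path and the example sentence are illustrative, not taken from the run above.

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # load the fine-tuned tagger saved as best-model.pt under the training base path
    tagger = SequenceTagger.load(
        "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
    )

    # tag a French sentence; entity spans are decoded from the BIOES labels listed above
    sentence = Sentence("Victor Hugo est né à Besançon .")
    tagger.predict(sentence)

    for span in sentence.get_spans("ner"):
        print(span.text, span.tag, round(span.score, 4))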