2023-10-13 10:44:10,449 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,450 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:44:10,450 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,450 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:44:10,450 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,450 Train: 966 sentences
2023-10-13 10:44:10,450 (train_with_dev=False, train_with_test=False)
2023-10-13 10:44:10,450 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,450 Training Params:
2023-10-13 10:44:10,450 - learning_rate: "3e-05"
2023-10-13 10:44:10,450 - mini_batch_size: "8"
2023-10-13 10:44:10,450 - max_epochs: "10"
2023-10-13 10:44:10,450 - shuffle: "True"
2023-10-13 10:44:10,450 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,450 Plugins:
2023-10-13 10:44:10,450 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:44:10,451 ----------------------------------------------------------------------------------------------------
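Note on the logged `lr` column: the values in the per-iteration lines are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below reproduces a few of them from the parameters in this log (966 training sentences, mini-batch size 8, 10 epochs, learning rate 3e-05). The function name `linear_warmup_lr` and its exact formula are assumptions made for illustration, not taken from the Flair source.

```python
import math

def linear_warmup_lr(step, base_lr, total_steps, warmup_fraction):
    """Hypothetical helper: linear warmup, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at the last step.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Values taken from the log header above.
steps_per_epoch = math.ceil(966 / 8)  # matches "iter .../121"
total_steps = steps_per_epoch * 10    # max_epochs = 10

# Steps 12 and 121 fall in epoch 1, step 242 ends epoch 2, step 1210 ends the run.
for step in (12, 121, 242, 1210):
    lr = linear_warmup_lr(step, 3e-05, total_steps, 0.1)
    print(f"step {step:4d}  lr {lr:.6f}")
```

Under these assumptions the warmup spans exactly the first epoch (121 of 1210 steps), which fits the learning rate peaking at 0.000030 at the end of epoch 1.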
2023-10-13 10:44:10,451 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:44:10,451 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:44:10,451 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,451 Computation:
2023-10-13 10:44:10,451 - compute on device: cuda:0
2023-10-13 10:44:10,451 - embedding storage: none
2023-10-13 10:44:10,451 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,451 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 10:44:10,451 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:10,451 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:11,132 epoch 1 - iter 12/121 - loss 3.51321597 - time (sec): 0.68 - samples/sec: 3523.65 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:44:11,834 epoch 1 - iter 24/121 - loss 3.29488879 - time (sec): 1.38 - samples/sec: 3581.80 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:44:12,512 epoch 1 - iter 36/121 - loss 2.99548579 - time (sec): 2.06 - samples/sec: 3394.31 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:44:13,299 epoch 1 - iter 48/121 - loss 2.41287355 - time (sec): 2.85 - samples/sec: 3445.12 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:44:14,009 epoch 1 - iter 60/121 - loss 2.04321003 - time (sec): 3.56 - samples/sec: 3458.41 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:44:14,676 epoch 1 - iter 72/121 - loss 1.79618228 - time (sec): 4.22 - samples/sec: 3498.27 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:44:15,433 epoch 1 - iter 84/121 - loss 1.62352073 - time (sec): 4.98 - samples/sec: 3452.27 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:44:16,153 epoch 1 - iter 96/121 - loss 1.48154791 - time (sec): 5.70 - samples/sec: 3470.15 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:44:16,929 epoch 1 - iter 108/121 - loss 1.36723369 - time (sec): 6.48 - samples/sec: 3430.89 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:44:17,617 epoch 1 - iter 120/121 - loss 1.27088232 - time (sec): 7.16 - samples/sec: 3446.16 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:44:17,666 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:17,666 EPOCH 1 done: loss 1.2702 - lr: 0.000030
2023-10-13 10:44:18,571 DEV : loss 0.3508635461330414 - f1-score (micro avg) 0.4129
2023-10-13 10:44:18,575 saving best model
2023-10-13 10:44:18,959 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:19,661 epoch 2 - iter 12/121 - loss 0.34872943 - time (sec): 0.70 - samples/sec: 3702.46 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:44:20,357 epoch 2 - iter 24/121 - loss 0.32940314 - time (sec): 1.40 - samples/sec: 3499.83 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:44:21,042 epoch 2 - iter 36/121 - loss 0.31951398 - time (sec): 2.08 - samples/sec: 3464.17 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:44:21,756 epoch 2 - iter 48/121 - loss 0.29887519 - time (sec): 2.80 - samples/sec: 3542.35 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:44:22,543 epoch 2 - iter 60/121 - loss 0.29679138 - time (sec): 3.58 - samples/sec: 3476.77 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:44:23,260 epoch 2 - iter 72/121 - loss 0.28969882 - time (sec): 4.30 - samples/sec: 3440.26 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:44:23,980 epoch 2 - iter 84/121 - loss 0.27870861 - time (sec): 5.02 - samples/sec: 3409.28 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:44:24,704 epoch 2 - iter 96/121 - loss 0.27071546 - time (sec): 5.74 - samples/sec: 3390.97 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:44:25,448 epoch 2 - iter 108/121 - loss 0.26477634 - time (sec): 6.49 - samples/sec: 3389.90 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:44:26,188 epoch 2 - iter 120/121 - loss 0.25613493 - time (sec): 7.23 - samples/sec: 3400.07 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:44:26,236 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:26,236 EPOCH 2 done: loss 0.2562 - lr: 0.000027
2023-10-13 10:44:26,979 DEV : loss 0.16044385731220245 - f1-score (micro avg) 0.723
2023-10-13 10:44:26,984 saving best model
2023-10-13 10:44:27,448 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:28,185 epoch 3 - iter 12/121 - loss 0.14606844 - time (sec): 0.73 - samples/sec: 3114.53 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:44:28,881 epoch 3 - iter 24/121 - loss 0.14489480 - time (sec): 1.43 - samples/sec: 3285.70 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:44:29,605 epoch 3 - iter 36/121 - loss 0.14732529 - time (sec): 2.15 - samples/sec: 3318.56 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:44:30,356 epoch 3 - iter 48/121 - loss 0.14501133 - time (sec): 2.90 - samples/sec: 3269.44 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:44:31,043 epoch 3 - iter 60/121 - loss 0.14185856 - time (sec): 3.59 - samples/sec: 3255.97 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:44:31,792 epoch 3 - iter 72/121 - loss 0.14156836 - time (sec): 4.34 - samples/sec: 3332.27 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:44:32,468 epoch 3 - iter 84/121 - loss 0.13930022 - time (sec): 5.02 - samples/sec: 3344.01 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:44:33,197 epoch 3 - iter 96/121 - loss 0.13453425 - time (sec): 5.75 - samples/sec: 3387.72 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:44:33,915 epoch 3 - iter 108/121 - loss 0.13486151 - time (sec): 6.46 - samples/sec: 3379.71 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:44:34,740 epoch 3 - iter 120/121 - loss 0.13199486 - time (sec): 7.29 - samples/sec: 3369.40 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:44:34,795 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:34,796 EPOCH 3 done: loss 0.1318 - lr: 0.000023
2023-10-13 10:44:35,547 DEV : loss 0.1352299004793167 - f1-score (micro avg) 0.7884
2023-10-13 10:44:35,552 saving best model
2023-10-13 10:44:36,012 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:36,811 epoch 4 - iter 12/121 - loss 0.07294851 - time (sec): 0.79 - samples/sec: 3234.72 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:44:37,552 epoch 4 - iter 24/121 - loss 0.07174414 - time (sec): 1.53 - samples/sec: 3466.33 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:44:38,288 epoch 4 - iter 36/121 - loss 0.08161430 - time (sec): 2.26 - samples/sec: 3388.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:44:38,982 epoch 4 - iter 48/121 - loss 0.08164795 - time (sec): 2.96 - samples/sec: 3429.61 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:44:39,666 epoch 4 - iter 60/121 - loss 0.08391292 - time (sec): 3.64 - samples/sec: 3433.24 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:44:40,399 epoch 4 - iter 72/121 - loss 0.08008740 - time (sec): 4.38 - samples/sec: 3375.57 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:44:41,111 epoch 4 - iter 84/121 - loss 0.08382287 - time (sec): 5.09 - samples/sec: 3391.95 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:44:41,929 epoch 4 - iter 96/121 - loss 0.08503397 - time (sec): 5.91 - samples/sec: 3365.88 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:44:42,633 epoch 4 - iter 108/121 - loss 0.08523333 - time (sec): 6.61 - samples/sec: 3356.44 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:44:43,351 epoch 4 - iter 120/121 - loss 0.08617500 - time (sec): 7.33 - samples/sec: 3356.76 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:44:43,398 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:43,398 EPOCH 4 done: loss 0.0857 - lr: 0.000020
2023-10-13 10:44:44,144 DEV : loss 0.12280040979385376 - f1-score (micro avg) 0.8232
2023-10-13 10:44:44,148 saving best model
2023-10-13 10:44:44,604 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:45,380 epoch 5 - iter 12/121 - loss 0.08088457 - time (sec): 0.77 - samples/sec: 3184.56 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:44:46,150 epoch 5 - iter 24/121 - loss 0.06668373 - time (sec): 1.54 - samples/sec: 3319.15 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:44:46,881 epoch 5 - iter 36/121 - loss 0.06423692 - time (sec): 2.27 - samples/sec: 3309.82 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:44:47,662 epoch 5 - iter 48/121 - loss 0.06070212 - time (sec): 3.05 - samples/sec: 3322.60 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:44:48,333 epoch 5 - iter 60/121 - loss 0.06194933 - time (sec): 3.72 - samples/sec: 3333.25 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:44:49,100 epoch 5 - iter 72/121 - loss 0.06304159 - time (sec): 4.49 - samples/sec: 3306.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:44:49,782 epoch 5 - iter 84/121 - loss 0.06360609 - time (sec): 5.17 - samples/sec: 3331.77 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:44:50,526 epoch 5 - iter 96/121 - loss 0.06367570 - time (sec): 5.91 - samples/sec: 3322.57 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:44:51,275 epoch 5 - iter 108/121 - loss 0.06269414 - time (sec): 6.66 - samples/sec: 3347.64 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:44:51,987 epoch 5 - iter 120/121 - loss 0.06168175 - time (sec): 7.38 - samples/sec: 3336.12 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:44:52,047 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:52,047 EPOCH 5 done: loss 0.0614 - lr: 0.000017
2023-10-13 10:44:52,821 DEV : loss 0.12571021914482117 - f1-score (micro avg) 0.8123
2023-10-13 10:44:52,826 ----------------------------------------------------------------------------------------------------
2023-10-13 10:44:53,502 epoch 6 - iter 12/121 - loss 0.06029715 - time (sec): 0.68 - samples/sec: 3405.79 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:44:54,238 epoch 6 - iter 24/121 - loss 0.04547772 - time (sec): 1.41 - samples/sec: 3318.54 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:44:55,072 epoch 6 - iter 36/121 - loss 0.04428078 - time (sec): 2.25 - samples/sec: 3264.89 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:44:55,796 epoch 6 - iter 48/121 - loss 0.04916090 - time (sec): 2.97 - samples/sec: 3366.87 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:44:56,487 epoch 6 - iter 60/121 - loss 0.04902978 - time (sec): 3.66 - samples/sec: 3328.14 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:44:57,276 epoch 6 - iter 72/121 - loss 0.04833321 - time (sec): 4.45 - samples/sec: 3337.54 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:44:58,011 epoch 6 - iter 84/121 - loss 0.04368248 - time (sec): 5.18 - samples/sec: 3365.14 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:44:58,714 epoch 6 - iter 96/121 - loss 0.04329465 - time (sec): 5.89 - samples/sec: 3385.53 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:44:59,407 epoch 6 - iter 108/121 - loss 0.04378744 - time (sec): 6.58 - samples/sec: 3377.78 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:45:00,098 epoch 6 - iter 120/121 - loss 0.04492980 - time (sec): 7.27 - samples/sec: 3384.12 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:45:00,147 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:00,147 EPOCH 6 done: loss 0.0447 - lr: 0.000013
2023-10-13 10:45:01,065 DEV : loss 0.14645931124687195 - f1-score (micro avg) 0.8302
2023-10-13 10:45:01,070 saving best model
2023-10-13 10:45:01,558 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:02,241 epoch 7 - iter 12/121 - loss 0.04212327 - time (sec): 0.68 - samples/sec: 3463.91 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:45:02,956 epoch 7 - iter 24/121 - loss 0.03030730 - time (sec): 1.40 - samples/sec: 3467.42 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:45:03,664 epoch 7 - iter 36/121 - loss 0.02753375 - time (sec): 2.10 - samples/sec: 3384.82 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:45:04,436 epoch 7 - iter 48/121 - loss 0.02907635 - time (sec): 2.88 - samples/sec: 3381.23 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:45:05,204 epoch 7 - iter 60/121 - loss 0.02750698 - time (sec): 3.64 - samples/sec: 3429.92 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:45:05,925 epoch 7 - iter 72/121 - loss 0.02826464 - time (sec): 4.36 - samples/sec: 3420.28 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:45:06,684 epoch 7 - iter 84/121 - loss 0.03040430 - time (sec): 5.12 - samples/sec: 3383.53 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:45:07,367 epoch 7 - iter 96/121 - loss 0.03095690 - time (sec): 5.81 - samples/sec: 3389.27 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:45:08,055 epoch 7 - iter 108/121 - loss 0.03036297 - time (sec): 6.50 - samples/sec: 3422.44 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:45:08,809 epoch 7 - iter 120/121 - loss 0.03149892 - time (sec): 7.25 - samples/sec: 3390.98 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:45:08,860 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:08,860 EPOCH 7 done: loss 0.0317 - lr: 0.000010
2023-10-13 10:45:09,616 DEV : loss 0.1589440405368805 - f1-score (micro avg) 0.8136
2023-10-13 10:45:09,621 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:10,338 epoch 8 - iter 12/121 - loss 0.03037428 - time (sec): 0.72 - samples/sec: 3324.50 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:45:11,043 epoch 8 - iter 24/121 - loss 0.03173288 - time (sec): 1.42 - samples/sec: 3154.50 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:45:11,804 epoch 8 - iter 36/121 - loss 0.02729142 - time (sec): 2.18 - samples/sec: 3174.68 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:45:12,625 epoch 8 - iter 48/121 - loss 0.02623794 - time (sec): 3.00 - samples/sec: 3228.35 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:45:13,340 epoch 8 - iter 60/121 - loss 0.02517167 - time (sec): 3.72 - samples/sec: 3266.24 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:45:14,055 epoch 8 - iter 72/121 - loss 0.02431651 - time (sec): 4.43 - samples/sec: 3309.11 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:45:14,740 epoch 8 - iter 84/121 - loss 0.02713066 - time (sec): 5.12 - samples/sec: 3316.82 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:45:15,467 epoch 8 - iter 96/121 - loss 0.02632913 - time (sec): 5.85 - samples/sec: 3286.93 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:45:16,283 epoch 8 - iter 108/121 - loss 0.02462711 - time (sec): 6.66 - samples/sec: 3302.97 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:45:17,035 epoch 8 - iter 120/121 - loss 0.02425663 - time (sec): 7.41 - samples/sec: 3319.99 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:45:17,086 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:17,086 EPOCH 8 done: loss 0.0241 - lr: 0.000007
2023-10-13 10:45:17,843 DEV : loss 0.16912458837032318 - f1-score (micro avg) 0.8396
2023-10-13 10:45:17,848 saving best model
2023-10-13 10:45:18,331 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:19,090 epoch 9 - iter 12/121 - loss 0.01197683 - time (sec): 0.75 - samples/sec: 3262.19 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:45:19,850 epoch 9 - iter 24/121 - loss 0.01853147 - time (sec): 1.51 - samples/sec: 3248.80 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:45:20,652 epoch 9 - iter 36/121 - loss 0.01813611 - time (sec): 2.31 - samples/sec: 3259.97 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:45:21,412 epoch 9 - iter 48/121 - loss 0.01789886 - time (sec): 3.07 - samples/sec: 3326.63 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:45:22,117 epoch 9 - iter 60/121 - loss 0.01892584 - time (sec): 3.77 - samples/sec: 3312.04 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:45:22,843 epoch 9 - iter 72/121 - loss 0.01887653 - time (sec): 4.50 - samples/sec: 3300.25 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:45:23,579 epoch 9 - iter 84/121 - loss 0.01813914 - time (sec): 5.24 - samples/sec: 3330.45 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:45:24,347 epoch 9 - iter 96/121 - loss 0.02178354 - time (sec): 6.00 - samples/sec: 3318.56 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:45:25,045 epoch 9 - iter 108/121 - loss 0.02094768 - time (sec): 6.70 - samples/sec: 3331.57 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:45:25,797 epoch 9 - iter 120/121 - loss 0.01977182 - time (sec): 7.45 - samples/sec: 3306.07 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:45:25,842 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:25,843 EPOCH 9 done: loss 0.0197 - lr: 0.000004
2023-10-13 10:45:26,597 DEV : loss 0.1732056736946106 - f1-score (micro avg) 0.8407
2023-10-13 10:45:26,603 saving best model
2023-10-13 10:45:27,097 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:27,828 epoch 10 - iter 12/121 - loss 0.02930384 - time (sec): 0.72 - samples/sec: 3381.93 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:45:28,620 epoch 10 - iter 24/121 - loss 0.01725666 - time (sec): 1.51 - samples/sec: 3402.96 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:45:29,395 epoch 10 - iter 36/121 - loss 0.01442001 - time (sec): 2.29 - samples/sec: 3239.62 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:45:30,116 epoch 10 - iter 48/121 - loss 0.01419669 - time (sec): 3.01 - samples/sec: 3290.84 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:45:30,918 epoch 10 - iter 60/121 - loss 0.01458065 - time (sec): 3.81 - samples/sec: 3246.16 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:45:31,687 epoch 10 - iter 72/121 - loss 0.01542659 - time (sec): 4.58 - samples/sec: 3255.17 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:45:32,415 epoch 10 - iter 84/121 - loss 0.01524820 - time (sec): 5.31 - samples/sec: 3242.25 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:45:33,159 epoch 10 - iter 96/121 - loss 0.01681722 - time (sec): 6.05 - samples/sec: 3244.42 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:45:33,897 epoch 10 - iter 108/121 - loss 0.01701795 - time (sec): 6.79 - samples/sec: 3232.95 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:45:34,692 epoch 10 - iter 120/121 - loss 0.01627520 - time (sec): 7.59 - samples/sec: 3244.26 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:45:34,742 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:34,742 EPOCH 10 done: loss 0.0162 - lr: 0.000000
2023-10-13 10:45:35,596 DEV : loss 0.17861199378967285 - f1-score (micro avg) 0.8354
2023-10-13 10:45:35,974 ----------------------------------------------------------------------------------------------------
2023-10-13 10:45:35,976 Loading model from best epoch ...
2023-10-13 10:45:37,432 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 10:45:38,311
Results:
- F-score (micro) 0.8125
- F-score (macro) 0.5575
- Accuracy 0.7035
By class:
              precision    recall  f1-score   support

        pers     0.8243    0.8777    0.8502       139
       scope     0.8456    0.8915    0.8679       129
        work     0.6667    0.7500    0.7059        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7952    0.8306    0.8125       360
   macro avg     0.6673    0.5483    0.5575       360
weighted avg     0.7944    0.8306    0.8052       360
2023-10-13 10:45:38,312 ----------------------------------------------------------------------------------------------------
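Note on the averaged scores: the gap between the micro F-score (0.8125) and the macro F-score (0.5575) comes from the two low-support classes, `loc` and `date`, which count as much as `pers` or `scope` in an unweighted mean. The sketch below recomputes the three averaged F1 values from the per-class rows reported in the results. The variable names are illustrative, and the inputs are the rounded values from the log, so agreement is only expected to about four decimal places.

```python
# Per-class rows copied from the "By class" results: (precision, recall, f1, support)
per_class = {
    "pers":  (0.8243, 0.8777, 0.8502, 139),
    "scope": (0.8456, 0.8915, 0.8679, 129),
    "work":  (0.6667, 0.7500, 0.7059, 80),
    "loc":   (1.0000, 0.2222, 0.3636, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.7952, 0.8306)

print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```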