2023-10-16 14:26:00,861 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,862 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-16 14:26:00,862 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,862 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-16 14:26:00,862 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,862 Train: 7142 sentences
2023-10-16 14:26:00,862 (train_with_dev=False, train_with_test=False)
2023-10-16 14:26:00,862 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Training Params:
2023-10-16 14:26:00,863 - learning_rate: "5e-05"
2023-10-16 14:26:00,863 - mini_batch_size: "4"
2023-10-16 14:26:00,863 - max_epochs: "10"
2023-10-16 14:26:00,863 - shuffle: "True"
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Plugins:
2023-10-16 14:26:00,863 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
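The LinearScheduler plugin with warmup_fraction 0.1 explains the learning-rate column in the epoch logs below: with 1786 batches per epoch over 10 epochs (17860 steps total), the LR ramps linearly to the peak of 5e-05 during the first 1786 steps, then decays linearly to zero. A minimal sketch of that schedule (a hypothetical helper, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps=17860, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: ramp from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: ramp from peak_lr down to 0
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, around step 178 this gives roughly 0.000005 and at step 1786 the peak 0.00005, matching the lr values logged during epoch 1.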
2023-10-16 14:26:00,863 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 14:26:00,863 - metric: "('micro avg', 'f1-score')"
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Computation:
2023-10-16 14:26:00,863 - compute on device: cuda:0
2023-10-16 14:26:00,863 - embedding storage: none
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:09,399 epoch 1 - iter 178/1786 - loss 1.72350929 - time (sec): 8.54 - samples/sec: 2893.61 - lr: 0.000005 - momentum: 0.000000
2023-10-16 14:26:18,210 epoch 1 - iter 356/1786 - loss 1.08436518 - time (sec): 17.35 - samples/sec: 2864.09 - lr: 0.000010 - momentum: 0.000000
2023-10-16 14:26:26,893 epoch 1 - iter 534/1786 - loss 0.82712712 - time (sec): 26.03 - samples/sec: 2889.48 - lr: 0.000015 - momentum: 0.000000
2023-10-16 14:26:35,720 epoch 1 - iter 712/1786 - loss 0.67539723 - time (sec): 34.86 - samples/sec: 2901.75 - lr: 0.000020 - momentum: 0.000000
2023-10-16 14:26:44,292 epoch 1 - iter 890/1786 - loss 0.58731779 - time (sec): 43.43 - samples/sec: 2887.71 - lr: 0.000025 - momentum: 0.000000
2023-10-16 14:26:52,634 epoch 1 - iter 1068/1786 - loss 0.52122298 - time (sec): 51.77 - samples/sec: 2875.76 - lr: 0.000030 - momentum: 0.000000
2023-10-16 14:27:01,396 epoch 1 - iter 1246/1786 - loss 0.46746722 - time (sec): 60.53 - samples/sec: 2891.65 - lr: 0.000035 - momentum: 0.000000
2023-10-16 14:27:09,876 epoch 1 - iter 1424/1786 - loss 0.43349886 - time (sec): 69.01 - samples/sec: 2898.54 - lr: 0.000040 - momentum: 0.000000
2023-10-16 14:27:18,516 epoch 1 - iter 1602/1786 - loss 0.40309372 - time (sec): 77.65 - samples/sec: 2881.00 - lr: 0.000045 - momentum: 0.000000
2023-10-16 14:27:27,304 epoch 1 - iter 1780/1786 - loss 0.37831641 - time (sec): 86.44 - samples/sec: 2868.61 - lr: 0.000050 - momentum: 0.000000
2023-10-16 14:27:27,579 ----------------------------------------------------------------------------------------------------
2023-10-16 14:27:27,579 EPOCH 1 done: loss 0.3776 - lr: 0.000050
2023-10-16 14:27:30,550 DEV : loss 0.1278153508901596 - f1-score (micro avg) 0.6762
2023-10-16 14:27:30,565 saving best model
2023-10-16 14:27:30,922 ----------------------------------------------------------------------------------------------------
2023-10-16 14:27:39,497 epoch 2 - iter 178/1786 - loss 0.14556974 - time (sec): 8.57 - samples/sec: 2912.61 - lr: 0.000049 - momentum: 0.000000
2023-10-16 14:27:48,242 epoch 2 - iter 356/1786 - loss 0.13073050 - time (sec): 17.32 - samples/sec: 2972.66 - lr: 0.000049 - momentum: 0.000000
2023-10-16 14:27:56,928 epoch 2 - iter 534/1786 - loss 0.13025242 - time (sec): 26.00 - samples/sec: 2967.14 - lr: 0.000048 - momentum: 0.000000
2023-10-16 14:28:05,608 epoch 2 - iter 712/1786 - loss 0.13292229 - time (sec): 34.68 - samples/sec: 2921.00 - lr: 0.000048 - momentum: 0.000000
2023-10-16 14:28:14,449 epoch 2 - iter 890/1786 - loss 0.13208372 - time (sec): 43.53 - samples/sec: 2900.44 - lr: 0.000047 - momentum: 0.000000
2023-10-16 14:28:23,136 epoch 2 - iter 1068/1786 - loss 0.13177483 - time (sec): 52.21 - samples/sec: 2897.01 - lr: 0.000047 - momentum: 0.000000
2023-10-16 14:28:31,937 epoch 2 - iter 1246/1786 - loss 0.12936425 - time (sec): 61.01 - samples/sec: 2875.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 14:28:40,602 epoch 2 - iter 1424/1786 - loss 0.12693416 - time (sec): 69.68 - samples/sec: 2874.59 - lr: 0.000046 - momentum: 0.000000
2023-10-16 14:28:49,147 epoch 2 - iter 1602/1786 - loss 0.12465130 - time (sec): 78.22 - samples/sec: 2883.32 - lr: 0.000045 - momentum: 0.000000
2023-10-16 14:28:57,347 epoch 2 - iter 1780/1786 - loss 0.12599998 - time (sec): 86.42 - samples/sec: 2865.96 - lr: 0.000044 - momentum: 0.000000
2023-10-16 14:28:57,611 ----------------------------------------------------------------------------------------------------
2023-10-16 14:28:57,611 EPOCH 2 done: loss 0.1259 - lr: 0.000044
2023-10-16 14:29:02,237 DEV : loss 0.1384890377521515 - f1-score (micro avg) 0.7327
2023-10-16 14:29:02,255 saving best model
2023-10-16 14:29:02,722 ----------------------------------------------------------------------------------------------------
2023-10-16 14:29:11,369 epoch 3 - iter 178/1786 - loss 0.08576002 - time (sec): 8.64 - samples/sec: 2753.36 - lr: 0.000044 - momentum: 0.000000
2023-10-16 14:29:20,279 epoch 3 - iter 356/1786 - loss 0.09184061 - time (sec): 17.55 - samples/sec: 2740.32 - lr: 0.000043 - momentum: 0.000000
2023-10-16 14:29:29,385 epoch 3 - iter 534/1786 - loss 0.09010177 - time (sec): 26.66 - samples/sec: 2727.10 - lr: 0.000043 - momentum: 0.000000
2023-10-16 14:29:38,115 epoch 3 - iter 712/1786 - loss 0.08566478 - time (sec): 35.39 - samples/sec: 2755.05 - lr: 0.000042 - momentum: 0.000000
2023-10-16 14:29:46,981 epoch 3 - iter 890/1786 - loss 0.08411972 - time (sec): 44.26 - samples/sec: 2771.07 - lr: 0.000042 - momentum: 0.000000
2023-10-16 14:29:55,567 epoch 3 - iter 1068/1786 - loss 0.08500553 - time (sec): 52.84 - samples/sec: 2780.96 - lr: 0.000041 - momentum: 0.000000
2023-10-16 14:30:04,319 epoch 3 - iter 1246/1786 - loss 0.08416921 - time (sec): 61.59 - samples/sec: 2801.64 - lr: 0.000041 - momentum: 0.000000
2023-10-16 14:30:13,120 epoch 3 - iter 1424/1786 - loss 0.08421054 - time (sec): 70.40 - samples/sec: 2818.47 - lr: 0.000040 - momentum: 0.000000
2023-10-16 14:30:22,250 epoch 3 - iter 1602/1786 - loss 0.08757976 - time (sec): 79.52 - samples/sec: 2800.20 - lr: 0.000039 - momentum: 0.000000
2023-10-16 14:30:31,006 epoch 3 - iter 1780/1786 - loss 0.08731846 - time (sec): 88.28 - samples/sec: 2807.01 - lr: 0.000039 - momentum: 0.000000
2023-10-16 14:30:31,300 ----------------------------------------------------------------------------------------------------
2023-10-16 14:30:31,300 EPOCH 3 done: loss 0.0873 - lr: 0.000039
2023-10-16 14:30:36,115 DEV : loss 0.1713087409734726 - f1-score (micro avg) 0.7429
2023-10-16 14:30:36,132 saving best model
2023-10-16 14:30:36,584 ----------------------------------------------------------------------------------------------------
2023-10-16 14:30:45,423 epoch 4 - iter 178/1786 - loss 0.05739948 - time (sec): 8.84 - samples/sec: 2942.92 - lr: 0.000038 - momentum: 0.000000
2023-10-16 14:30:54,134 epoch 4 - iter 356/1786 - loss 0.05690690 - time (sec): 17.55 - samples/sec: 2888.92 - lr: 0.000038 - momentum: 0.000000
2023-10-16 14:31:02,732 epoch 4 - iter 534/1786 - loss 0.05637698 - time (sec): 26.14 - samples/sec: 2869.13 - lr: 0.000037 - momentum: 0.000000
2023-10-16 14:31:11,563 epoch 4 - iter 712/1786 - loss 0.05914208 - time (sec): 34.98 - samples/sec: 2868.87 - lr: 0.000037 - momentum: 0.000000
2023-10-16 14:31:20,009 epoch 4 - iter 890/1786 - loss 0.05997229 - time (sec): 43.42 - samples/sec: 2864.77 - lr: 0.000036 - momentum: 0.000000
2023-10-16 14:31:28,634 epoch 4 - iter 1068/1786 - loss 0.06052032 - time (sec): 52.05 - samples/sec: 2871.82 - lr: 0.000036 - momentum: 0.000000
2023-10-16 14:31:37,097 epoch 4 - iter 1246/1786 - loss 0.06191085 - time (sec): 60.51 - samples/sec: 2851.50 - lr: 0.000035 - momentum: 0.000000
2023-10-16 14:31:45,688 epoch 4 - iter 1424/1786 - loss 0.06222248 - time (sec): 69.10 - samples/sec: 2853.96 - lr: 0.000034 - momentum: 0.000000
2023-10-16 14:31:54,395 epoch 4 - iter 1602/1786 - loss 0.06351783 - time (sec): 77.81 - samples/sec: 2852.68 - lr: 0.000034 - momentum: 0.000000
2023-10-16 14:32:03,271 epoch 4 - iter 1780/1786 - loss 0.06440999 - time (sec): 86.68 - samples/sec: 2860.50 - lr: 0.000033 - momentum: 0.000000
2023-10-16 14:32:03,570 ----------------------------------------------------------------------------------------------------
2023-10-16 14:32:03,570 EPOCH 4 done: loss 0.0644 - lr: 0.000033
2023-10-16 14:32:07,654 DEV : loss 0.1713092029094696 - f1-score (micro avg) 0.7712
2023-10-16 14:32:07,670 saving best model
2023-10-16 14:32:08,139 ----------------------------------------------------------------------------------------------------
2023-10-16 14:32:16,968 epoch 5 - iter 178/1786 - loss 0.03588763 - time (sec): 8.83 - samples/sec: 2627.03 - lr: 0.000033 - momentum: 0.000000
2023-10-16 14:32:25,763 epoch 5 - iter 356/1786 - loss 0.04421636 - time (sec): 17.62 - samples/sec: 2862.46 - lr: 0.000032 - momentum: 0.000000
2023-10-16 14:32:34,456 epoch 5 - iter 534/1786 - loss 0.04709247 - time (sec): 26.31 - samples/sec: 2875.81 - lr: 0.000032 - momentum: 0.000000
2023-10-16 14:32:43,150 epoch 5 - iter 712/1786 - loss 0.04802956 - time (sec): 35.01 - samples/sec: 2881.20 - lr: 0.000031 - momentum: 0.000000
2023-10-16 14:32:51,542 epoch 5 - iter 890/1786 - loss 0.04796144 - time (sec): 43.40 - samples/sec: 2841.11 - lr: 0.000031 - momentum: 0.000000
2023-10-16 14:33:00,455 epoch 5 - iter 1068/1786 - loss 0.04753006 - time (sec): 52.31 - samples/sec: 2868.11 - lr: 0.000030 - momentum: 0.000000
2023-10-16 14:33:09,033 epoch 5 - iter 1246/1786 - loss 0.04905045 - time (sec): 60.89 - samples/sec: 2863.09 - lr: 0.000029 - momentum: 0.000000
2023-10-16 14:33:17,625 epoch 5 - iter 1424/1786 - loss 0.04974146 - time (sec): 69.48 - samples/sec: 2854.21 - lr: 0.000029 - momentum: 0.000000
2023-10-16 14:33:26,438 epoch 5 - iter 1602/1786 - loss 0.04973847 - time (sec): 78.30 - samples/sec: 2845.43 - lr: 0.000028 - momentum: 0.000000
2023-10-16 14:33:35,208 epoch 5 - iter 1780/1786 - loss 0.04912550 - time (sec): 87.07 - samples/sec: 2851.07 - lr: 0.000028 - momentum: 0.000000
2023-10-16 14:33:35,495 ----------------------------------------------------------------------------------------------------
2023-10-16 14:33:35,495 EPOCH 5 done: loss 0.0492 - lr: 0.000028
2023-10-16 14:33:40,088 DEV : loss 0.1638714075088501 - f1-score (micro avg) 0.786
2023-10-16 14:33:40,104 saving best model
2023-10-16 14:33:40,563 ----------------------------------------------------------------------------------------------------
2023-10-16 14:33:49,049 epoch 6 - iter 178/1786 - loss 0.04047784 - time (sec): 8.48 - samples/sec: 2804.18 - lr: 0.000027 - momentum: 0.000000
2023-10-16 14:33:57,790 epoch 6 - iter 356/1786 - loss 0.03466111 - time (sec): 17.22 - samples/sec: 2894.43 - lr: 0.000027 - momentum: 0.000000
2023-10-16 14:34:06,507 epoch 6 - iter 534/1786 - loss 0.03645853 - time (sec): 25.94 - samples/sec: 2876.13 - lr: 0.000026 - momentum: 0.000000
2023-10-16 14:34:15,323 epoch 6 - iter 712/1786 - loss 0.03843536 - time (sec): 34.76 - samples/sec: 2876.10 - lr: 0.000026 - momentum: 0.000000
2023-10-16 14:34:23,863 epoch 6 - iter 890/1786 - loss 0.04112112 - time (sec): 43.30 - samples/sec: 2866.24 - lr: 0.000025 - momentum: 0.000000
2023-10-16 14:34:32,461 epoch 6 - iter 1068/1786 - loss 0.04047041 - time (sec): 51.89 - samples/sec: 2912.93 - lr: 0.000024 - momentum: 0.000000
2023-10-16 14:34:40,749 epoch 6 - iter 1246/1786 - loss 0.04134837 - time (sec): 60.18 - samples/sec: 2923.91 - lr: 0.000024 - momentum: 0.000000
2023-10-16 14:34:49,172 epoch 6 - iter 1424/1786 - loss 0.04218921 - time (sec): 68.61 - samples/sec: 2929.62 - lr: 0.000023 - momentum: 0.000000
2023-10-16 14:34:57,673 epoch 6 - iter 1602/1786 - loss 0.04256479 - time (sec): 77.11 - samples/sec: 2907.31 - lr: 0.000023 - momentum: 0.000000
2023-10-16 14:35:06,362 epoch 6 - iter 1780/1786 - loss 0.04276236 - time (sec): 85.80 - samples/sec: 2893.03 - lr: 0.000022 - momentum: 0.000000
2023-10-16 14:35:06,640 ----------------------------------------------------------------------------------------------------
2023-10-16 14:35:06,640 EPOCH 6 done: loss 0.0429 - lr: 0.000022
2023-10-16 14:35:11,228 DEV : loss 0.16154661774635315 - f1-score (micro avg) 0.7973
2023-10-16 14:35:11,244 saving best model
2023-10-16 14:35:11,739 ----------------------------------------------------------------------------------------------------
2023-10-16 14:35:20,356 epoch 7 - iter 178/1786 - loss 0.03094916 - time (sec): 8.61 - samples/sec: 2725.79 - lr: 0.000022 - momentum: 0.000000
2023-10-16 14:35:28,854 epoch 7 - iter 356/1786 - loss 0.03031893 - time (sec): 17.11 - samples/sec: 2747.67 - lr: 0.000021 - momentum: 0.000000
2023-10-16 14:35:37,566 epoch 7 - iter 534/1786 - loss 0.03010387 - time (sec): 25.82 - samples/sec: 2799.25 - lr: 0.000021 - momentum: 0.000000
2023-10-16 14:35:46,181 epoch 7 - iter 712/1786 - loss 0.03065030 - time (sec): 34.44 - samples/sec: 2835.81 - lr: 0.000020 - momentum: 0.000000
2023-10-16 14:35:54,743 epoch 7 - iter 890/1786 - loss 0.02853794 - time (sec): 43.00 - samples/sec: 2820.94 - lr: 0.000019 - momentum: 0.000000
2023-10-16 14:36:03,394 epoch 7 - iter 1068/1786 - loss 0.02784728 - time (sec): 51.65 - samples/sec: 2829.24 - lr: 0.000019 - momentum: 0.000000
2023-10-16 14:36:12,171 epoch 7 - iter 1246/1786 - loss 0.02834536 - time (sec): 60.43 - samples/sec: 2854.33 - lr: 0.000018 - momentum: 0.000000
2023-10-16 14:36:21,073 epoch 7 - iter 1424/1786 - loss 0.02869632 - time (sec): 69.33 - samples/sec: 2875.27 - lr: 0.000018 - momentum: 0.000000
2023-10-16 14:36:29,562 epoch 7 - iter 1602/1786 - loss 0.02920281 - time (sec): 77.82 - samples/sec: 2867.77 - lr: 0.000017 - momentum: 0.000000
2023-10-16 14:36:38,162 epoch 7 - iter 1780/1786 - loss 0.03013120 - time (sec): 86.42 - samples/sec: 2870.16 - lr: 0.000017 - momentum: 0.000000
2023-10-16 14:36:38,448 ----------------------------------------------------------------------------------------------------
2023-10-16 14:36:38,449 EPOCH 7 done: loss 0.0301 - lr: 0.000017
2023-10-16 14:36:42,497 DEV : loss 0.18691599369049072 - f1-score (micro avg) 0.8076
2023-10-16 14:36:42,513 saving best model
2023-10-16 14:36:42,974 ----------------------------------------------------------------------------------------------------
2023-10-16 14:36:51,660 epoch 8 - iter 178/1786 - loss 0.02062719 - time (sec): 8.68 - samples/sec: 2748.88 - lr: 0.000016 - momentum: 0.000000
2023-10-16 14:37:00,179 epoch 8 - iter 356/1786 - loss 0.02050486 - time (sec): 17.20 - samples/sec: 2833.37 - lr: 0.000016 - momentum: 0.000000
2023-10-16 14:37:08,847 epoch 8 - iter 534/1786 - loss 0.02086256 - time (sec): 25.87 - samples/sec: 2853.42 - lr: 0.000015 - momentum: 0.000000
2023-10-16 14:37:17,490 epoch 8 - iter 712/1786 - loss 0.02131782 - time (sec): 34.51 - samples/sec: 2873.72 - lr: 0.000014 - momentum: 0.000000
2023-10-16 14:37:25,774 epoch 8 - iter 890/1786 - loss 0.02156856 - time (sec): 42.80 - samples/sec: 2852.53 - lr: 0.000014 - momentum: 0.000000
2023-10-16 14:37:34,448 epoch 8 - iter 1068/1786 - loss 0.02181934 - time (sec): 51.47 - samples/sec: 2841.35 - lr: 0.000013 - momentum: 0.000000
2023-10-16 14:37:43,284 epoch 8 - iter 1246/1786 - loss 0.02187373 - time (sec): 60.31 - samples/sec: 2873.93 - lr: 0.000013 - momentum: 0.000000
2023-10-16 14:37:52,258 epoch 8 - iter 1424/1786 - loss 0.02158534 - time (sec): 69.28 - samples/sec: 2876.68 - lr: 0.000012 - momentum: 0.000000
2023-10-16 14:38:00,408 epoch 8 - iter 1602/1786 - loss 0.02156756 - time (sec): 77.43 - samples/sec: 2868.83 - lr: 0.000012 - momentum: 0.000000
2023-10-16 14:38:08,929 epoch 8 - iter 1780/1786 - loss 0.02125784 - time (sec): 85.95 - samples/sec: 2880.58 - lr: 0.000011 - momentum: 0.000000
2023-10-16 14:38:09,269 ----------------------------------------------------------------------------------------------------
2023-10-16 14:38:09,269 EPOCH 8 done: loss 0.0212 - lr: 0.000011
2023-10-16 14:38:13,927 DEV : loss 0.19830918312072754 - f1-score (micro avg) 0.8
2023-10-16 14:38:13,943 ----------------------------------------------------------------------------------------------------
2023-10-16 14:38:22,614 epoch 9 - iter 178/1786 - loss 0.01444712 - time (sec): 8.67 - samples/sec: 2747.32 - lr: 0.000011 - momentum: 0.000000
2023-10-16 14:38:31,308 epoch 9 - iter 356/1786 - loss 0.01386542 - time (sec): 17.36 - samples/sec: 2790.61 - lr: 0.000010 - momentum: 0.000000
2023-10-16 14:38:39,923 epoch 9 - iter 534/1786 - loss 0.01456268 - time (sec): 25.98 - samples/sec: 2781.99 - lr: 0.000009 - momentum: 0.000000
2023-10-16 14:38:48,588 epoch 9 - iter 712/1786 - loss 0.01617640 - time (sec): 34.64 - samples/sec: 2799.31 - lr: 0.000009 - momentum: 0.000000
2023-10-16 14:38:57,041 epoch 9 - iter 890/1786 - loss 0.01623150 - time (sec): 43.10 - samples/sec: 2795.59 - lr: 0.000008 - momentum: 0.000000
2023-10-16 14:39:05,722 epoch 9 - iter 1068/1786 - loss 0.01612223 - time (sec): 51.78 - samples/sec: 2819.35 - lr: 0.000008 - momentum: 0.000000
2023-10-16 14:39:14,680 epoch 9 - iter 1246/1786 - loss 0.01568525 - time (sec): 60.74 - samples/sec: 2811.42 - lr: 0.000007 - momentum: 0.000000
2023-10-16 14:39:23,598 epoch 9 - iter 1424/1786 - loss 0.01603606 - time (sec): 69.65 - samples/sec: 2832.39 - lr: 0.000007 - momentum: 0.000000
2023-10-16 14:39:32,277 epoch 9 - iter 1602/1786 - loss 0.01544125 - time (sec): 78.33 - samples/sec: 2846.53 - lr: 0.000006 - momentum: 0.000000
2023-10-16 14:39:40,960 epoch 9 - iter 1780/1786 - loss 0.01507366 - time (sec): 87.02 - samples/sec: 2849.40 - lr: 0.000006 - momentum: 0.000000
2023-10-16 14:39:41,259 ----------------------------------------------------------------------------------------------------
2023-10-16 14:39:41,260 EPOCH 9 done: loss 0.0151 - lr: 0.000006
2023-10-16 14:39:45,962 DEV : loss 0.20294949412345886 - f1-score (micro avg) 0.8003
2023-10-16 14:39:45,978 ----------------------------------------------------------------------------------------------------
2023-10-16 14:39:54,760 epoch 10 - iter 178/1786 - loss 0.01198823 - time (sec): 8.78 - samples/sec: 2808.38 - lr: 0.000005 - momentum: 0.000000
2023-10-16 14:40:03,269 epoch 10 - iter 356/1786 - loss 0.01082116 - time (sec): 17.29 - samples/sec: 2772.31 - lr: 0.000004 - momentum: 0.000000
2023-10-16 14:40:11,969 epoch 10 - iter 534/1786 - loss 0.00989378 - time (sec): 25.99 - samples/sec: 2800.29 - lr: 0.000004 - momentum: 0.000000
2023-10-16 14:40:20,848 epoch 10 - iter 712/1786 - loss 0.01074502 - time (sec): 34.87 - samples/sec: 2810.52 - lr: 0.000003 - momentum: 0.000000
2023-10-16 14:40:29,477 epoch 10 - iter 890/1786 - loss 0.00999905 - time (sec): 43.50 - samples/sec: 2816.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 14:40:38,192 epoch 10 - iter 1068/1786 - loss 0.00997578 - time (sec): 52.21 - samples/sec: 2819.62 - lr: 0.000002 - momentum: 0.000000
2023-10-16 14:40:46,875 epoch 10 - iter 1246/1786 - loss 0.00978065 - time (sec): 60.90 - samples/sec: 2821.42 - lr: 0.000002 - momentum: 0.000000
2023-10-16 14:40:55,564 epoch 10 - iter 1424/1786 - loss 0.00992214 - time (sec): 69.58 - samples/sec: 2829.76 - lr: 0.000001 - momentum: 0.000000
2023-10-16 14:41:04,111 epoch 10 - iter 1602/1786 - loss 0.01021399 - time (sec): 78.13 - samples/sec: 2835.76 - lr: 0.000001 - momentum: 0.000000
2023-10-16 14:41:12,876 epoch 10 - iter 1780/1786 - loss 0.01056216 - time (sec): 86.90 - samples/sec: 2851.53 - lr: 0.000000 - momentum: 0.000000
2023-10-16 14:41:13,167 ----------------------------------------------------------------------------------------------------
2023-10-16 14:41:13,167 EPOCH 10 done: loss 0.0105 - lr: 0.000000
2023-10-16 14:41:17,252 DEV : loss 0.20407141745090485 - f1-score (micro avg) 0.8005
2023-10-16 14:41:17,631 ----------------------------------------------------------------------------------------------------
2023-10-16 14:41:17,632 Loading model from best epoch ...
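The "best epoch" is the one with the highest dev micro-F1; in the run above that is epoch 7 (0.8076), the last point at which "saving best model" appears. The selection logic amounts to (a hypothetical sketch, not Flair's internal code):

```python
def best_epoch(dev_scores):
    """Return (1-based epoch, score) of the highest dev score; ties keep the earlier epoch."""
    best_e, best_s = 0, float("-inf")
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best_s:  # strictly better -> this checkpoint would be saved
            best_e, best_s = epoch, score
    return best_e, best_s

# dev micro-F1 per epoch, taken from the log above
dev_f1 = [0.6762, 0.7327, 0.7429, 0.7712, 0.786, 0.7973, 0.8076, 0.8, 0.8003, 0.8005]
```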
2023-10-16 14:41:19,064 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
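The 17 tags follow the BIOES scheme (Single, Begin, End, Inside, plus O for outside) over the four entity types PER, LOC, ORG and HumanProd. A minimal decoder sketch (hypothetical, not Flair's span decoding code) showing how such a tag sequence maps to entity spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, _, lbl = tag.partition("-")
        if prefix == "S":           # single-token entity
            spans.append((i, i + 1, lbl))
            start, label = None, None
        elif prefix == "B":         # entity begins
            start, label = i, lbl
        elif prefix == "E" and start is not None and lbl == label:
            spans.append((start, i + 1, lbl))  # entity ends
            start, label = None, None
        # "I" continues an open span; inconsistent sequences are dropped
    return spans
```

For example, `["O", "B-PER", "I-PER", "E-PER", "S-LOC"]` decodes to a PER span over tokens 1-3 and a single-token LOC span.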
2023-10-16 14:41:28,416
Results:
- F-score (micro) 0.6717
- F-score (macro) 0.5958
- Accuracy 0.5213
By class:
              precision    recall  f1-score   support

         LOC     0.6643    0.6721    0.6682      1095
         PER     0.7561    0.7292    0.7425      1012
         ORG     0.5041    0.5182    0.5110       357
   HumanProd     0.4000    0.5455    0.4615        33

   micro avg     0.6719    0.6716    0.6717      2497
   macro avg     0.5811    0.6163    0.5958      2497
weighted avg     0.6751    0.6716    0.6731      2497
2023-10-16 14:41:28,416 ----------------------------------------------------------------------------------------------------
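As a sanity check on the final table, the macro average is the unweighted mean of the per-class F1 scores, while the weighted average weights each class by its support (all numbers taken from the log above):

```python
# per-class test F1 and support from the final evaluation
f1 = {"LOC": 0.6682, "PER": 0.7425, "ORG": 0.5110, "HumanProd": 0.4615}
support = {"LOC": 1095, "PER": 1012, "ORG": 357, "HumanProd": 33}

macro_f1 = sum(f1.values()) / len(f1)                   # unweighted mean, ~0.5958
weighted_f1 = (sum(f1[c] * support[c] for c in f1)
               / sum(support.values()))                  # support-weighted, ~0.6731
```

Both reproduce the "macro avg" and "weighted avg" rows of the table to four decimals.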