2023-10-14 01:17:01,167 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Train: 7936 sentences
2023-10-14 01:17:01,168 (train_with_dev=False, train_with_test=False)
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Training Params:
2023-10-14 01:17:01,168 - learning_rate: "5e-05"
2023-10-14 01:17:01,168 - mini_batch_size: "8"
2023-10-14 01:17:01,168 - max_epochs: "10"
2023-10-14 01:17:01,168 - shuffle: "True"
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Plugins:
2023-10-14 01:17:01,168 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 01:17:01,169 - metric: "('micro avg', 'f1-score')"
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 Computation:
2023-10-14 01:17:01,169 - compute on device: cuda:0
2023-10-14 01:17:01,169 - embedding storage: none
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:06,714 epoch 1 - iter 99/992 - loss 1.86588415 - time (sec): 5.54 - samples/sec: 2785.36 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:17:12,449 epoch 1 - iter 198/992 - loss 1.09873625 - time (sec): 11.28 - samples/sec: 2794.75 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:17:18,387 epoch 1 - iter 297/992 - loss 0.80249953 - time (sec): 17.22 - samples/sec: 2794.10 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:17:23,944 epoch 1 - iter 396/992 - loss 0.64530064 - time (sec): 22.77 - samples/sec: 2826.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:17:29,847 epoch 1 - iter 495/992 - loss 0.54951004 - time (sec): 28.68 - samples/sec: 2818.40 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:17:35,860 epoch 1 - iter 594/992 - loss 0.47822528 - time (sec): 34.69 - samples/sec: 2821.08 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:17:41,757 epoch 1 - iter 693/992 - loss 0.43156759 - time (sec): 40.59 - samples/sec: 2802.37 - lr: 0.000035 - momentum: 0.000000
2023-10-14 01:17:47,720 epoch 1 - iter 792/992 - loss 0.39387496 - time (sec): 46.55 - samples/sec: 2794.11 - lr: 0.000040 - momentum: 0.000000
2023-10-14 01:17:53,595 epoch 1 - iter 891/992 - loss 0.36458290 - time (sec): 52.42 - samples/sec: 2790.26 - lr: 0.000045 - momentum: 0.000000
2023-10-14 01:17:59,691 epoch 1 - iter 990/992 - loss 0.34069275 - time (sec): 58.52 - samples/sec: 2792.26 - lr: 0.000050 - momentum: 0.000000
2023-10-14 01:17:59,900 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:59,900 EPOCH 1 done: loss 0.3399 - lr: 0.000050
2023-10-14 01:18:03,409 DEV : loss 0.09731486439704895 - f1-score (micro avg) 0.6696
2023-10-14 01:18:03,433 saving best model
2023-10-14 01:18:03,828 ----------------------------------------------------------------------------------------------------
2023-10-14 01:18:09,534 epoch 2 - iter 99/992 - loss 0.12915040 - time (sec): 5.70 - samples/sec: 2665.74 - lr: 0.000049 - momentum: 0.000000
2023-10-14 01:18:15,365 epoch 2 - iter 198/992 - loss 0.11559230 - time (sec): 11.54 - samples/sec: 2704.02 - lr: 0.000049 - momentum: 0.000000
2023-10-14 01:18:20,960 epoch 2 - iter 297/992 - loss 0.11414920 - time (sec): 17.13 - samples/sec: 2751.32 - lr: 0.000048 - momentum: 0.000000
2023-10-14 01:18:26,882 epoch 2 - iter 396/992 - loss 0.10894772 - time (sec): 23.05 - samples/sec: 2761.13 - lr: 0.000048 - momentum: 0.000000
2023-10-14 01:18:32,637 epoch 2 - iter 495/992 - loss 0.10862939 - time (sec): 28.81 - samples/sec: 2806.00 - lr: 0.000047 - momentum: 0.000000
2023-10-14 01:18:38,576 epoch 2 - iter 594/992 - loss 0.10732068 - time (sec): 34.75 - samples/sec: 2812.77 - lr: 0.000047 - momentum: 0.000000
2023-10-14 01:18:44,413 epoch 2 - iter 693/992 - loss 0.10605689 - time (sec): 40.58 - samples/sec: 2814.76 - lr: 0.000046 - momentum: 0.000000
2023-10-14 01:18:50,172 epoch 2 - iter 792/992 - loss 0.10337874 - time (sec): 46.34 - samples/sec: 2810.16 - lr: 0.000046 - momentum: 0.000000
2023-10-14 01:18:56,347 epoch 2 - iter 891/992 - loss 0.10259994 - time (sec): 52.52 - samples/sec: 2799.94 - lr: 0.000045 - momentum: 0.000000
2023-10-14 01:19:02,162 epoch 2 - iter 990/992 - loss 0.10325477 - time (sec): 58.33 - samples/sec: 2802.95 - lr: 0.000044 - momentum: 0.000000
2023-10-14 01:19:02,321 ----------------------------------------------------------------------------------------------------
2023-10-14 01:19:02,321 EPOCH 2 done: loss 0.1032 - lr: 0.000044
2023-10-14 01:19:05,742 DEV : loss 0.09102991223335266 - f1-score (micro avg) 0.7377
2023-10-14 01:19:05,763 saving best model
2023-10-14 01:19:06,277 ----------------------------------------------------------------------------------------------------
2023-10-14 01:19:11,924 epoch 3 - iter 99/992 - loss 0.06196304 - time (sec): 5.64 - samples/sec: 2675.30 - lr: 0.000044 - momentum: 0.000000
2023-10-14 01:19:17,975 epoch 3 - iter 198/992 - loss 0.06676977 - time (sec): 11.70 - samples/sec: 2774.34 - lr: 0.000043 - momentum: 0.000000
2023-10-14 01:19:23,503 epoch 3 - iter 297/992 - loss 0.07044537 - time (sec): 17.22 - samples/sec: 2780.26 - lr: 0.000043 - momentum: 0.000000
2023-10-14 01:19:29,407 epoch 3 - iter 396/992 - loss 0.07030135 - time (sec): 23.13 - samples/sec: 2755.93 - lr: 0.000042 - momentum: 0.000000
2023-10-14 01:19:35,437 epoch 3 - iter 495/992 - loss 0.06871568 - time (sec): 29.16 - samples/sec: 2793.45 - lr: 0.000042 - momentum: 0.000000
2023-10-14 01:19:41,277 epoch 3 - iter 594/992 - loss 0.07133824 - time (sec): 35.00 - samples/sec: 2795.31 - lr: 0.000041 - momentum: 0.000000
2023-10-14 01:19:47,157 epoch 3 - iter 693/992 - loss 0.07146656 - time (sec): 40.88 - samples/sec: 2803.67 - lr: 0.000041 - momentum: 0.000000
2023-10-14 01:19:53,718 epoch 3 - iter 792/992 - loss 0.07164041 - time (sec): 47.44 - samples/sec: 2769.56 - lr: 0.000040 - momentum: 0.000000
2023-10-14 01:19:59,430 epoch 3 - iter 891/992 - loss 0.07108609 - time (sec): 53.15 - samples/sec: 2765.26 - lr: 0.000039 - momentum: 0.000000
2023-10-14 01:20:05,167 epoch 3 - iter 990/992 - loss 0.07155895 - time (sec): 58.89 - samples/sec: 2778.37 - lr: 0.000039 - momentum: 0.000000
2023-10-14 01:20:05,297 ----------------------------------------------------------------------------------------------------
2023-10-14 01:20:05,298 EPOCH 3 done: loss 0.0715 - lr: 0.000039
2023-10-14 01:20:08,732 DEV : loss 0.11880763620138168 - f1-score (micro avg) 0.7402
2023-10-14 01:20:08,754 saving best model
2023-10-14 01:20:09,277 ----------------------------------------------------------------------------------------------------
2023-10-14 01:20:15,188 epoch 4 - iter 99/992 - loss 0.04511134 - time (sec): 5.91 - samples/sec: 2964.60 - lr: 0.000038 - momentum: 0.000000
2023-10-14 01:20:20,986 epoch 4 - iter 198/992 - loss 0.04991613 - time (sec): 11.71 - samples/sec: 2885.57 - lr: 0.000038 - momentum: 0.000000
2023-10-14 01:20:26,705 epoch 4 - iter 297/992 - loss 0.05461873 - time (sec): 17.43 - samples/sec: 2874.60 - lr: 0.000037 - momentum: 0.000000
2023-10-14 01:20:32,606 epoch 4 - iter 396/992 - loss 0.05366870 - time (sec): 23.33 - samples/sec: 2832.04 - lr: 0.000037 - momentum: 0.000000
2023-10-14 01:20:38,619 epoch 4 - iter 495/992 - loss 0.05304821 - time (sec): 29.34 - samples/sec: 2819.86 - lr: 0.000036 - momentum: 0.000000
2023-10-14 01:20:44,603 epoch 4 - iter 594/992 - loss 0.05335391 - time (sec): 35.32 - samples/sec: 2795.76 - lr: 0.000036 - momentum: 0.000000
2023-10-14 01:20:50,230 epoch 4 - iter 693/992 - loss 0.05328344 - time (sec): 40.95 - samples/sec: 2791.28 - lr: 0.000035 - momentum: 0.000000
2023-10-14 01:20:55,778 epoch 4 - iter 792/992 - loss 0.05344227 - time (sec): 46.50 - samples/sec: 2806.23 - lr: 0.000034 - momentum: 0.000000
2023-10-14 01:21:01,258 epoch 4 - iter 891/992 - loss 0.05342411 - time (sec): 51.98 - samples/sec: 2812.25 - lr: 0.000034 - momentum: 0.000000
2023-10-14 01:21:07,289 epoch 4 - iter 990/992 - loss 0.05595410 - time (sec): 58.01 - samples/sec: 2821.18 - lr: 0.000033 - momentum: 0.000000
2023-10-14 01:21:07,456 ----------------------------------------------------------------------------------------------------
2023-10-14 01:21:07,456 EPOCH 4 done: loss 0.0559 - lr: 0.000033
2023-10-14 01:21:10,852 DEV : loss 0.1232018768787384 - f1-score (micro avg) 0.7481
2023-10-14 01:21:10,872 saving best model
2023-10-14 01:21:11,360 ----------------------------------------------------------------------------------------------------
2023-10-14 01:21:17,069 epoch 5 - iter 99/992 - loss 0.03903289 - time (sec): 5.71 - samples/sec: 2900.52 - lr: 0.000033 - momentum: 0.000000
2023-10-14 01:21:22,768 epoch 5 - iter 198/992 - loss 0.03876095 - time (sec): 11.41 - samples/sec: 2924.11 - lr: 0.000032 - momentum: 0.000000
2023-10-14 01:21:28,419 epoch 5 - iter 297/992 - loss 0.04243056 - time (sec): 17.06 - samples/sec: 2890.32 - lr: 0.000032 - momentum: 0.000000
2023-10-14 01:21:34,101 epoch 5 - iter 396/992 - loss 0.03983144 - time (sec): 22.74 - samples/sec: 2898.00 - lr: 0.000031 - momentum: 0.000000
2023-10-14 01:21:39,707 epoch 5 - iter 495/992 - loss 0.03928814 - time (sec): 28.34 - samples/sec: 2904.32 - lr: 0.000031 - momentum: 0.000000
2023-10-14 01:21:45,374 epoch 5 - iter 594/992 - loss 0.03949960 - time (sec): 34.01 - samples/sec: 2907.48 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:21:50,895 epoch 5 - iter 693/992 - loss 0.04107109 - time (sec): 39.53 - samples/sec: 2888.93 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:21:57,000 epoch 5 - iter 792/992 - loss 0.04131385 - time (sec): 45.64 - samples/sec: 2875.31 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:22:02,977 epoch 5 - iter 891/992 - loss 0.04155687 - time (sec): 51.61 - samples/sec: 2853.04 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:22:08,801 epoch 5 - iter 990/992 - loss 0.04121265 - time (sec): 57.44 - samples/sec: 2850.10 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:22:08,913 ----------------------------------------------------------------------------------------------------
2023-10-14 01:22:08,913 EPOCH 5 done: loss 0.0413 - lr: 0.000028
2023-10-14 01:22:12,796 DEV : loss 0.16722512245178223 - f1-score (micro avg) 0.7586
2023-10-14 01:22:12,817 saving best model
2023-10-14 01:22:13,338 ----------------------------------------------------------------------------------------------------
2023-10-14 01:22:19,723 epoch 6 - iter 99/992 - loss 0.03646699 - time (sec): 6.38 - samples/sec: 2714.64 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:22:25,276 epoch 6 - iter 198/992 - loss 0.03616235 - time (sec): 11.94 - samples/sec: 2798.79 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:22:30,923 epoch 6 - iter 297/992 - loss 0.03179580 - time (sec): 17.58 - samples/sec: 2799.10 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:22:36,766 epoch 6 - iter 396/992 - loss 0.03143445 - time (sec): 23.43 - samples/sec: 2811.42 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:22:42,630 epoch 6 - iter 495/992 - loss 0.03072024 - time (sec): 29.29 - samples/sec: 2815.34 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:22:48,259 epoch 6 - iter 594/992 - loss 0.03079986 - time (sec): 34.92 - samples/sec: 2818.20 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:22:54,316 epoch 6 - iter 693/992 - loss 0.03043770 - time (sec): 40.98 - samples/sec: 2799.34 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:23:00,319 epoch 6 - iter 792/992 - loss 0.02994022 - time (sec): 46.98 - samples/sec: 2790.61 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:23:06,471 epoch 6 - iter 891/992 - loss 0.03012561 - time (sec): 53.13 - samples/sec: 2787.05 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:23:12,180 epoch 6 - iter 990/992 - loss 0.03016026 - time (sec): 58.84 - samples/sec: 2782.11 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:23:12,291 ----------------------------------------------------------------------------------------------------
2023-10-14 01:23:12,291 EPOCH 6 done: loss 0.0301 - lr: 0.000022
2023-10-14 01:23:15,730 DEV : loss 0.17587369680404663 - f1-score (micro avg) 0.7538
2023-10-14 01:23:15,751 ----------------------------------------------------------------------------------------------------
2023-10-14 01:23:21,560 epoch 7 - iter 99/992 - loss 0.02680858 - time (sec): 5.81 - samples/sec: 2783.87 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:23:27,386 epoch 7 - iter 198/992 - loss 0.03036132 - time (sec): 11.63 - samples/sec: 2760.05 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:23:33,314 epoch 7 - iter 297/992 - loss 0.02504973 - time (sec): 17.56 - samples/sec: 2794.69 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:23:39,233 epoch 7 - iter 396/992 - loss 0.02602930 - time (sec): 23.48 - samples/sec: 2795.69 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:23:44,944 epoch 7 - iter 495/992 - loss 0.02459364 - time (sec): 29.19 - samples/sec: 2796.12 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:23:50,850 epoch 7 - iter 594/992 - loss 0.02499387 - time (sec): 35.10 - samples/sec: 2800.96 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:23:56,991 epoch 7 - iter 693/992 - loss 0.02457250 - time (sec): 41.24 - samples/sec: 2789.41 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:24:02,722 epoch 7 - iter 792/992 - loss 0.02502738 - time (sec): 46.97 - samples/sec: 2791.68 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:24:08,621 epoch 7 - iter 891/992 - loss 0.02451035 - time (sec): 52.87 - samples/sec: 2789.21 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:24:14,382 epoch 7 - iter 990/992 - loss 0.02379099 - time (sec): 58.63 - samples/sec: 2791.31 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:24:14,488 ----------------------------------------------------------------------------------------------------
2023-10-14 01:24:14,489 EPOCH 7 done: loss 0.0238 - lr: 0.000017
2023-10-14 01:24:18,294 DEV : loss 0.1903119683265686 - f1-score (micro avg) 0.7529
2023-10-14 01:24:18,315 ----------------------------------------------------------------------------------------------------
2023-10-14 01:24:24,188 epoch 8 - iter 99/992 - loss 0.01483288 - time (sec): 5.87 - samples/sec: 2925.75 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:24:29,964 epoch 8 - iter 198/992 - loss 0.01246495 - time (sec): 11.65 - samples/sec: 2854.69 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:24:35,622 epoch 8 - iter 297/992 - loss 0.01429814 - time (sec): 17.31 - samples/sec: 2821.25 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:24:41,814 epoch 8 - iter 396/992 - loss 0.01483730 - time (sec): 23.50 - samples/sec: 2813.76 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:24:47,786 epoch 8 - iter 495/992 - loss 0.01516838 - time (sec): 29.47 - samples/sec: 2816.87 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:24:53,786 epoch 8 - iter 594/992 - loss 0.01536659 - time (sec): 35.47 - samples/sec: 2816.83 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:24:59,378 epoch 8 - iter 693/992 - loss 0.01490078 - time (sec): 41.06 - samples/sec: 2828.30 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:25:05,215 epoch 8 - iter 792/992 - loss 0.01534412 - time (sec): 46.90 - samples/sec: 2812.41 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:25:11,007 epoch 8 - iter 891/992 - loss 0.01570233 - time (sec): 52.69 - samples/sec: 2806.90 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:25:16,688 epoch 8 - iter 990/992 - loss 0.01551159 - time (sec): 58.37 - samples/sec: 2805.59 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:25:16,785 ----------------------------------------------------------------------------------------------------
2023-10-14 01:25:16,785 EPOCH 8 done: loss 0.0155 - lr: 0.000011
2023-10-14 01:25:20,520 DEV : loss 0.20634520053863525 - f1-score (micro avg) 0.7621
2023-10-14 01:25:20,553 saving best model
2023-10-14 01:25:21,058 ----------------------------------------------------------------------------------------------------
2023-10-14 01:25:26,647 epoch 9 - iter 99/992 - loss 0.01027848 - time (sec): 5.59 - samples/sec: 2895.54 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:25:32,581 epoch 9 - iter 198/992 - loss 0.01067700 - time (sec): 11.52 - samples/sec: 2883.19 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:25:38,714 epoch 9 - iter 297/992 - loss 0.01091812 - time (sec): 17.65 - samples/sec: 2834.44 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:25:44,463 epoch 9 - iter 396/992 - loss 0.01073457 - time (sec): 23.40 - samples/sec: 2805.81 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:25:50,371 epoch 9 - iter 495/992 - loss 0.01029950 - time (sec): 29.31 - samples/sec: 2807.06 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:25:56,117 epoch 9 - iter 594/992 - loss 0.01110924 - time (sec): 35.06 - samples/sec: 2815.10 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:26:02,160 epoch 9 - iter 693/992 - loss 0.01176843 - time (sec): 41.10 - samples/sec: 2792.15 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:26:08,166 epoch 9 - iter 792/992 - loss 0.01153996 - time (sec): 47.11 - samples/sec: 2796.98 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:26:13,818 epoch 9 - iter 891/992 - loss 0.01138962 - time (sec): 52.76 - samples/sec: 2798.37 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:26:19,543 epoch 9 - iter 990/992 - loss 0.01156023 - time (sec): 58.48 - samples/sec: 2796.42 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:26:19,696 ----------------------------------------------------------------------------------------------------
2023-10-14 01:26:19,696 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-14 01:26:23,692 DEV : loss 0.2140767127275467 - f1-score (micro avg) 0.7623
2023-10-14 01:26:23,714 saving best model
2023-10-14 01:26:24,211 ----------------------------------------------------------------------------------------------------
2023-10-14 01:26:30,251 epoch 10 - iter 99/992 - loss 0.00607883 - time (sec): 6.04 - samples/sec: 2911.03 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:26:36,244 epoch 10 - iter 198/992 - loss 0.00672977 - time (sec): 12.03 - samples/sec: 2833.64 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:26:41,831 epoch 10 - iter 297/992 - loss 0.00683740 - time (sec): 17.62 - samples/sec: 2791.38 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:26:47,746 epoch 10 - iter 396/992 - loss 0.00754100 - time (sec): 23.53 - samples/sec: 2795.11 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:26:53,622 epoch 10 - iter 495/992 - loss 0.00737902 - time (sec): 29.41 - samples/sec: 2800.18 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:26:59,458 epoch 10 - iter 594/992 - loss 0.00713022 - time (sec): 35.24 - samples/sec: 2791.65 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:27:05,247 epoch 10 - iter 693/992 - loss 0.00806298 - time (sec): 41.03 - samples/sec: 2793.59 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:27:11,236 epoch 10 - iter 792/992 - loss 0.00809314 - time (sec): 47.02 - samples/sec: 2792.23 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:27:16,908 epoch 10 - iter 891/992 - loss 0.00803018 - time (sec): 52.69 - samples/sec: 2806.21 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:27:22,660 epoch 10 - iter 990/992 - loss 0.00838705 - time (sec): 58.45 - samples/sec: 2800.71 - lr: 0.000000 - momentum: 0.000000
2023-10-14 01:27:22,767 ----------------------------------------------------------------------------------------------------
2023-10-14 01:27:22,767 EPOCH 10 done: loss 0.0084 - lr: 0.000000
2023-10-14 01:27:26,208 DEV : loss 0.22556838393211365 - f1-score (micro avg) 0.7641
2023-10-14 01:27:26,232 saving best model
2023-10-14 01:27:27,131 ----------------------------------------------------------------------------------------------------
2023-10-14 01:27:27,132 Loading model from best epoch ...
2023-10-14 01:27:28,425 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 01:27:31,682
Results:
- F-score (micro) 0.7925
- F-score (macro) 0.712
- Accuracy 0.6784
By class:
              precision    recall  f1-score   support

         LOC     0.8363    0.8656    0.8507       655
         PER     0.7336    0.8027    0.7666       223
         ORG     0.5536    0.4882    0.5188       127

   micro avg     0.7814    0.8040    0.7925      1005
   macro avg     0.7078    0.7188    0.7120      1005
weighted avg     0.7778    0.8040    0.7901      1005
2023-10-14 01:27:31,682 ----------------------------------------------------------------------------------------------------
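Note: the averaged F-scores in the results above can be cross-checked from the per-class rows. The sketch below is not part of the training run. It assumes the usual definitions: macro F1 is the unweighted mean of the class F1 scores, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of the micro-averaged precision and recall. The variable names are illustrative, and all inputs are the four-decimal values printed in the log, so agreement is only up to rounding.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
per_class = {
    "LOC": (0.8363, 0.8656, 0.8507, 655),
    "PER": (0.7336, 0.8027, 0.7666, 223),
    "ORG": (0.5536, 0.4882, 0.5188, 127),
}

# Total number of gold entities across the three classes.
total_support = sum(row[3] for row in per_class.values())

# Macro F1: plain mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: mean of the per-class F1 scores weighted by support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall
# reported in the "micro avg" row.
micro_precision, micro_recall = 0.7814, 0.8040
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support={total_support}")
print(f"macro F1={macro_f1:.4f}")
print(f"weighted F1={weighted_f1:.4f}")
print(f"micro F1={micro_f1:.4f}")
```

The recomputed values match the "macro avg", "weighted avg", and "micro avg" rows to four decimals. The low ORG score (0.5188 on 127 entities) is what pulls the macro average well below the micro average.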