2023-10-14 01:17:01,167 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Train: 7936 sentences
2023-10-14 01:17:01,168 (train_with_dev=False, train_with_test=False)
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Training Params:
2023-10-14 01:17:01,168 - learning_rate: "5e-05"
2023-10-14 01:17:01,168 - mini_batch_size: "8"
2023-10-14 01:17:01,168 - max_epochs: "10"
2023-10-14 01:17:01,168 - shuffle: "True"
2023-10-14 01:17:01,168 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,168 Plugins:
2023-10-14 01:17:01,168 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 01:17:01,169 - metric: "('micro avg', 'f1-score')"
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 Computation:
2023-10-14 01:17:01,169 - compute on device: cuda:0
2023-10-14 01:17:01,169 - embedding storage: none
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:01,169 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:06,714 epoch 1 - iter 99/992 - loss 1.86588415 - time (sec): 5.54 - samples/sec: 2785.36 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:17:12,449 epoch 1 - iter 198/992 - loss 1.09873625 - time (sec): 11.28 - samples/sec: 2794.75 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:17:18,387 epoch 1 - iter 297/992 - loss 0.80249953 - time (sec): 17.22 - samples/sec: 2794.10 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:17:23,944 epoch 1 - iter 396/992 - loss 0.64530064 - time (sec): 22.77 - samples/sec: 2826.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:17:29,847 epoch 1 - iter 495/992 - loss 0.54951004 - time (sec): 28.68 - samples/sec: 2818.40 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:17:35,860 epoch 1 - iter 594/992 - loss 0.47822528 - time (sec): 34.69 - samples/sec: 2821.08 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:17:41,757 epoch 1 - iter 693/992 - loss 0.43156759 - time (sec): 40.59 - samples/sec: 2802.37 - lr: 0.000035 - momentum: 0.000000
2023-10-14 01:17:47,720 epoch 1 - iter 792/992 - loss 0.39387496 - time (sec): 46.55 - samples/sec: 2794.11 - lr: 0.000040 - momentum: 0.000000
2023-10-14 01:17:53,595 epoch 1 - iter 891/992 - loss 0.36458290 - time (sec): 52.42 - samples/sec: 2790.26 - lr: 0.000045 - momentum: 0.000000
2023-10-14 01:17:59,691 epoch 1 - iter 990/992 - loss 0.34069275 - time (sec): 58.52 - samples/sec: 2792.26 - lr: 0.000050 - momentum: 0.000000
2023-10-14 01:17:59,900 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:59,900 EPOCH 1 done: loss 0.3399 - lr: 0.000050
2023-10-14 01:18:03,409 DEV : loss 0.09731486439704895 - f1-score (micro avg) 0.6696
2023-10-14 01:18:03,433 saving best model
2023-10-14 01:18:03,828 ----------------------------------------------------------------------------------------------------
2023-10-14 01:18:09,534 epoch 2 - iter 99/992 - loss 0.12915040 - time (sec): 5.70 - samples/sec: 2665.74 - lr: 0.000049 - momentum: 0.000000
2023-10-14 01:18:15,365 epoch 2 - iter 198/992 - loss 0.11559230 - time (sec): 11.54 - samples/sec: 2704.02 - lr: 0.000049 - momentum: 0.000000
2023-10-14 01:18:20,960 epoch 2 - iter 297/992 - loss 0.11414920 - time (sec): 17.13 - samples/sec: 2751.32 - lr: 0.000048 - momentum: 0.000000
2023-10-14 01:18:26,882 epoch 2 - iter 396/992 - loss 0.10894772 - time (sec): 23.05 - samples/sec: 2761.13 - lr: 0.000048 - momentum: 0.000000
2023-10-14 01:18:32,637 epoch 2 - iter 495/992 - loss 0.10862939 - time (sec): 28.81 - samples/sec: 2806.00 - lr: 0.000047 - momentum: 0.000000
2023-10-14 01:18:38,576 epoch 2 - iter 594/992 - loss 0.10732068 - time (sec): 34.75 - samples/sec: 2812.77 - lr: 0.000047 - momentum: 0.000000
2023-10-14 01:18:44,413 epoch 2 - iter 693/992 - loss 0.10605689 - time (sec): 40.58 - samples/sec: 2814.76 - lr: 0.000046 - momentum: 0.000000
2023-10-14 01:18:50,172 epoch 2 - iter 792/992 - loss 0.10337874 - time (sec): 46.34 - samples/sec: 2810.16 - lr: 0.000046 - momentum: 0.000000
2023-10-14 01:18:56,347 epoch 2 - iter 891/992 - loss 0.10259994 - time (sec): 52.52 - samples/sec: 2799.94 - lr: 0.000045 - momentum: 0.000000
2023-10-14 01:19:02,162 epoch 2 - iter 990/992 - loss 0.10325477 - time (sec): 58.33 - samples/sec: 2802.95 - lr: 0.000044 - momentum: 0.000000
2023-10-14 01:19:02,321 ----------------------------------------------------------------------------------------------------
2023-10-14 01:19:02,321 EPOCH 2 done: loss 0.1032 - lr: 0.000044
2023-10-14 01:19:05,742 DEV : loss 0.09102991223335266 - f1-score (micro avg) 0.7377
2023-10-14 01:19:05,763 saving best model
2023-10-14 01:19:06,277 ----------------------------------------------------------------------------------------------------
2023-10-14 01:19:11,924 epoch 3 - iter 99/992 - loss 0.06196304 - time (sec): 5.64 - samples/sec: 2675.30 - lr: 0.000044 - momentum: 0.000000
2023-10-14 01:19:17,975 epoch 3 - iter 198/992 - loss 0.06676977 - time (sec): 11.70 - samples/sec: 2774.34 - lr: 0.000043 - momentum: 0.000000
2023-10-14 01:19:23,503 epoch 3 - iter 297/992 - loss 0.07044537 - time (sec): 17.22 - samples/sec: 2780.26 - lr: 0.000043 - momentum: 0.000000
2023-10-14 01:19:29,407 epoch 3 - iter 396/992 - loss 0.07030135 - time (sec): 23.13 - samples/sec: 2755.93 - lr: 0.000042 - momentum: 0.000000
2023-10-14 01:19:35,437 epoch 3 - iter 495/992 - loss 0.06871568 - time (sec): 29.16 - samples/sec: 2793.45 - lr: 0.000042 - momentum: 0.000000
2023-10-14 01:19:41,277 epoch 3 - iter 594/992 - loss 0.07133824 - time (sec): 35.00 - samples/sec: 2795.31 - lr: 0.000041 - momentum: 0.000000
2023-10-14 01:19:47,157 epoch 3 - iter 693/992 - loss 0.07146656 - time (sec): 40.88 - samples/sec: 2803.67 - lr: 0.000041 - momentum: 0.000000
2023-10-14 01:19:53,718 epoch 3 - iter 792/992 - loss 0.07164041 - time (sec): 47.44 - samples/sec: 2769.56 - lr: 0.000040 - momentum: 0.000000
2023-10-14 01:19:59,430 epoch 3 - iter 891/992 - loss 0.07108609 - time (sec): 53.15 - samples/sec: 2765.26 - lr: 0.000039 - momentum: 0.000000
2023-10-14 01:20:05,167 epoch 3 - iter 990/992 - loss 0.07155895 - time (sec): 58.89 - samples/sec: 2778.37 - lr: 0.000039 - momentum: 0.000000
2023-10-14 01:20:05,297 ----------------------------------------------------------------------------------------------------
2023-10-14 01:20:05,298 EPOCH 3 done: loss 0.0715 - lr: 0.000039
2023-10-14 01:20:08,732 DEV : loss 0.11880763620138168 - f1-score (micro avg) 0.7402
2023-10-14 01:20:08,754 saving best model
2023-10-14 01:20:09,277 ----------------------------------------------------------------------------------------------------
2023-10-14 01:20:15,188 epoch 4 - iter 99/992 - loss 0.04511134 - time (sec): 5.91 - samples/sec: 2964.60 - lr: 0.000038 - momentum: 0.000000
2023-10-14 01:20:20,986 epoch 4 - iter 198/992 - loss 0.04991613 - time (sec): 11.71 - samples/sec: 2885.57 - lr: 0.000038 - momentum: 0.000000
2023-10-14 01:20:26,705 epoch 4 - iter 297/992 - loss 0.05461873 - time (sec): 17.43 - samples/sec: 2874.60 - lr: 0.000037 - momentum: 0.000000
2023-10-14 01:20:32,606 epoch 4 - iter 396/992 - loss 0.05366870 - time (sec): 23.33 - samples/sec: 2832.04 - lr: 0.000037 - momentum: 0.000000
2023-10-14 01:20:38,619 epoch 4 - iter 495/992 - loss 0.05304821 - time (sec): 29.34 - samples/sec: 2819.86 - lr: 0.000036 - momentum: 0.000000
2023-10-14 01:20:44,603 epoch 4 - iter 594/992 - loss 0.05335391 - time (sec): 35.32 - samples/sec: 2795.76 - lr: 0.000036 - momentum: 0.000000
2023-10-14 01:20:50,230 epoch 4 - iter 693/992 - loss 0.05328344 - time (sec): 40.95 - samples/sec: 2791.28 - lr: 0.000035 - momentum: 0.000000
2023-10-14 01:20:55,778 epoch 4 - iter 792/992 - loss 0.05344227 - time (sec): 46.50 - samples/sec: 2806.23 - lr: 0.000034 - momentum: 0.000000
2023-10-14 01:21:01,258 epoch 4 - iter 891/992 - loss 0.05342411 - time (sec): 51.98 - samples/sec: 2812.25 - lr: 0.000034 - momentum: 0.000000
2023-10-14 01:21:07,289 epoch 4 - iter 990/992 - loss 0.05595410 - time (sec): 58.01 - samples/sec: 2821.18 - lr: 0.000033 - momentum: 0.000000
2023-10-14 01:21:07,456 ----------------------------------------------------------------------------------------------------
2023-10-14 01:21:07,456 EPOCH 4 done: loss 0.0559 - lr: 0.000033
2023-10-14 01:21:10,852 DEV : loss 0.1232018768787384 - f1-score (micro avg) 0.7481
2023-10-14 01:21:10,872 saving best model
2023-10-14 01:21:11,360 ----------------------------------------------------------------------------------------------------
2023-10-14 01:21:17,069 epoch 5 - iter 99/992 - loss 0.03903289 - time (sec): 5.71 - samples/sec: 2900.52 - lr: 0.000033 - momentum: 0.000000
2023-10-14 01:21:22,768 epoch 5 - iter 198/992 - loss 0.03876095 - time (sec): 11.41 - samples/sec: 2924.11 - lr: 0.000032 - momentum: 0.000000
2023-10-14 01:21:28,419 epoch 5 - iter 297/992 - loss 0.04243056 - time (sec): 17.06 - samples/sec: 2890.32 - lr: 0.000032 - momentum: 0.000000
2023-10-14 01:21:34,101 epoch 5 - iter 396/992 - loss 0.03983144 - time (sec): 22.74 - samples/sec: 2898.00 - lr: 0.000031 - momentum: 0.000000
2023-10-14 01:21:39,707 epoch 5 - iter 495/992 - loss 0.03928814 - time (sec): 28.34 - samples/sec: 2904.32 - lr: 0.000031 - momentum: 0.000000
2023-10-14 01:21:45,374 epoch 5 - iter 594/992 - loss 0.03949960 - time (sec): 34.01 - samples/sec: 2907.48 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:21:50,895 epoch 5 - iter 693/992 - loss 0.04107109 - time (sec): 39.53 - samples/sec: 2888.93 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:21:57,000 epoch 5 - iter 792/992 - loss 0.04131385 - time (sec): 45.64 - samples/sec: 2875.31 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:22:02,977 epoch 5 - iter 891/992 - loss 0.04155687 - time (sec): 51.61 - samples/sec: 2853.04 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:22:08,801 epoch 5 - iter 990/992 - loss 0.04121265 - time (sec): 57.44 - samples/sec: 2850.10 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:22:08,913 ----------------------------------------------------------------------------------------------------
2023-10-14 01:22:08,913 EPOCH 5 done: loss 0.0413 - lr: 0.000028
2023-10-14 01:22:12,796 DEV : loss 0.16722512245178223 - f1-score (micro avg) 0.7586
2023-10-14 01:22:12,817 saving best model
2023-10-14 01:22:13,338 ----------------------------------------------------------------------------------------------------
2023-10-14 01:22:19,723 epoch 6 - iter 99/992 - loss 0.03646699 - time (sec): 6.38 - samples/sec: 2714.64 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:22:25,276 epoch 6 - iter 198/992 - loss 0.03616235 - time (sec): 11.94 - samples/sec: 2798.79 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:22:30,923 epoch 6 - iter 297/992 - loss 0.03179580 - time (sec): 17.58 - samples/sec: 2799.10 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:22:36,766 epoch 6 - iter 396/992 - loss 0.03143445 - time (sec): 23.43 - samples/sec: 2811.42 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:22:42,630 epoch 6 - iter 495/992 - loss 0.03072024 - time (sec): 29.29 - samples/sec: 2815.34 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:22:48,259 epoch 6 - iter 594/992 - loss 0.03079986 - time (sec): 34.92 - samples/sec: 2818.20 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:22:54,316 epoch 6 - iter 693/992 - loss 0.03043770 - time (sec): 40.98 - samples/sec: 2799.34 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:23:00,319 epoch 6 - iter 792/992 - loss 0.02994022 - time (sec): 46.98 - samples/sec: 2790.61 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:23:06,471 epoch 6 - iter 891/992 - loss 0.03012561 - time (sec): 53.13 - samples/sec: 2787.05 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:23:12,180 epoch 6 - iter 990/992 - loss 0.03016026 - time (sec): 58.84 - samples/sec: 2782.11 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:23:12,291 ----------------------------------------------------------------------------------------------------
2023-10-14 01:23:12,291 EPOCH 6 done: loss 0.0301 - lr: 0.000022
2023-10-14 01:23:15,730 DEV : loss 0.17587369680404663 - f1-score (micro avg) 0.7538
2023-10-14 01:23:15,751 ----------------------------------------------------------------------------------------------------
2023-10-14 01:23:21,560 epoch 7 - iter 99/992 - loss 0.02680858 - time (sec): 5.81 - samples/sec: 2783.87 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:23:27,386 epoch 7 - iter 198/992 - loss 0.03036132 - time (sec): 11.63 - samples/sec: 2760.05 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:23:33,314 epoch 7 - iter 297/992 - loss 0.02504973 - time (sec): 17.56 - samples/sec: 2794.69 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:23:39,233 epoch 7 - iter 396/992 - loss 0.02602930 - time (sec): 23.48 - samples/sec: 2795.69 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:23:44,944 epoch 7 - iter 495/992 - loss 0.02459364 - time (sec): 29.19 - samples/sec: 2796.12 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:23:50,850 epoch 7 - iter 594/992 - loss 0.02499387 - time (sec): 35.10 - samples/sec: 2800.96 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:23:56,991 epoch 7 - iter 693/992 - loss 0.02457250 - time (sec): 41.24 - samples/sec: 2789.41 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:24:02,722 epoch 7 - iter 792/992 - loss 0.02502738 - time (sec): 46.97 - samples/sec: 2791.68 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:24:08,621 epoch 7 - iter 891/992 - loss 0.02451035 - time (sec): 52.87 - samples/sec: 2789.21 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:24:14,382 epoch 7 - iter 990/992 - loss 0.02379099 - time (sec): 58.63 - samples/sec: 2791.31 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:24:14,488 ----------------------------------------------------------------------------------------------------
2023-10-14 01:24:14,489 EPOCH 7 done: loss 0.0238 - lr: 0.000017
2023-10-14 01:24:18,294 DEV : loss 0.1903119683265686 - f1-score (micro avg) 0.7529
2023-10-14 01:24:18,315 ----------------------------------------------------------------------------------------------------
2023-10-14 01:24:24,188 epoch 8 - iter 99/992 - loss 0.01483288 - time (sec): 5.87 - samples/sec: 2925.75 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:24:29,964 epoch 8 - iter 198/992 - loss 0.01246495 - time (sec): 11.65 - samples/sec: 2854.69 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:24:35,622 epoch 8 - iter 297/992 - loss 0.01429814 - time (sec): 17.31 - samples/sec: 2821.25 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:24:41,814 epoch 8 - iter 396/992 - loss 0.01483730 - time (sec): 23.50 - samples/sec: 2813.76 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:24:47,786 epoch 8 - iter 495/992 - loss 0.01516838 - time (sec): 29.47 - samples/sec: 2816.87 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:24:53,786 epoch 8 - iter 594/992 - loss 0.01536659 - time (sec): 35.47 - samples/sec: 2816.83 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:24:59,378 epoch 8 - iter 693/992 - loss 0.01490078 - time (sec): 41.06 - samples/sec: 2828.30 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:25:05,215 epoch 8 - iter 792/992 - loss 0.01534412 - time (sec): 46.90 - samples/sec: 2812.41 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:25:11,007 epoch 8 - iter 891/992 - loss 0.01570233 - time (sec): 52.69 - samples/sec: 2806.90 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:25:16,688 epoch 8 - iter 990/992 - loss 0.01551159 - time (sec): 58.37 - samples/sec: 2805.59 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:25:16,785 ----------------------------------------------------------------------------------------------------
2023-10-14 01:25:16,785 EPOCH 8 done: loss 0.0155 - lr: 0.000011
2023-10-14 01:25:20,520 DEV : loss 0.20634520053863525 - f1-score (micro avg) 0.7621
2023-10-14 01:25:20,553 saving best model
2023-10-14 01:25:21,058 ----------------------------------------------------------------------------------------------------
2023-10-14 01:25:26,647 epoch 9 - iter 99/992 - loss 0.01027848 - time (sec): 5.59 - samples/sec: 2895.54 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:25:32,581 epoch 9 - iter 198/992 - loss 0.01067700 - time (sec): 11.52 - samples/sec: 2883.19 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:25:38,714 epoch 9 - iter 297/992 - loss 0.01091812 - time (sec): 17.65 - samples/sec: 2834.44 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:25:44,463 epoch 9 - iter 396/992 - loss 0.01073457 - time (sec): 23.40 - samples/sec: 2805.81 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:25:50,371 epoch 9 - iter 495/992 - loss 0.01029950 - time (sec): 29.31 - samples/sec: 2807.06 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:25:56,117 epoch 9 - iter 594/992 - loss 0.01110924 - time (sec): 35.06 - samples/sec: 2815.10 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:26:02,160 epoch 9 - iter 693/992 - loss 0.01176843 - time (sec): 41.10 - samples/sec: 2792.15 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:26:08,166 epoch 9 - iter 792/992 - loss 0.01153996 - time (sec): 47.11 - samples/sec: 2796.98 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:26:13,818 epoch 9 - iter 891/992 - loss 0.01138962 - time (sec): 52.76 - samples/sec: 2798.37 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:26:19,543 epoch 9 - iter 990/992 - loss 0.01156023 - time (sec): 58.48 - samples/sec: 2796.42 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:26:19,696 ----------------------------------------------------------------------------------------------------
2023-10-14 01:26:19,696 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-14 01:26:23,692 DEV : loss 0.2140767127275467 - f1-score (micro avg) 0.7623
2023-10-14 01:26:23,714 saving best model
2023-10-14 01:26:24,211 ----------------------------------------------------------------------------------------------------
2023-10-14 01:26:30,251 epoch 10 - iter 99/992 - loss 0.00607883 - time (sec): 6.04 - samples/sec: 2911.03 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:26:36,244 epoch 10 - iter 198/992 - loss 0.00672977 - time (sec): 12.03 - samples/sec: 2833.64 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:26:41,831 epoch 10 - iter 297/992 - loss 0.00683740 - time (sec): 17.62 - samples/sec: 2791.38 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:26:47,746 epoch 10 - iter 396/992 - loss 0.00754100 - time (sec): 23.53 - samples/sec: 2795.11 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:26:53,622 epoch 10 - iter 495/992 - loss 0.00737902 - time (sec): 29.41 - samples/sec: 2800.18 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:26:59,458 epoch 10 - iter 594/992 - loss 0.00713022 - time (sec): 35.24 - samples/sec: 2791.65 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:27:05,247 epoch 10 - iter 693/992 - loss 0.00806298 - time (sec): 41.03 - samples/sec: 2793.59 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:27:11,236 epoch 10 - iter 792/992 - loss 0.00809314 - time (sec): 47.02 - samples/sec: 2792.23 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:27:16,908 epoch 10 - iter 891/992 - loss 0.00803018 - time (sec): 52.69 - samples/sec: 2806.21 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:27:22,660 epoch 10 - iter 990/992 - loss 0.00838705 - time (sec): 58.45 - samples/sec: 2800.71 - lr: 0.000000 - momentum: 0.000000
2023-10-14 01:27:22,767 ----------------------------------------------------------------------------------------------------
2023-10-14 01:27:22,767 EPOCH 10 done: loss 0.0084 - lr: 0.000000
2023-10-14 01:27:26,208 DEV : loss 0.22556838393211365 - f1-score (micro avg) 0.7641
2023-10-14 01:27:26,232 saving best model
2023-10-14 01:27:27,131 ----------------------------------------------------------------------------------------------------
2023-10-14 01:27:27,132 Loading model from best epoch ...
2023-10-14 01:27:28,425 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 01:27:31,682
Results:
- F-score (micro) 0.7925
- F-score (macro) 0.712
- Accuracy 0.6784

By class:
              precision    recall  f1-score   support

         LOC     0.8363    0.8656    0.8507       655
         PER     0.7336    0.8027    0.7666       223
         ORG     0.5536    0.4882    0.5188       127

   micro avg     0.7814    0.8040    0.7925      1005
   macro avg     0.7078    0.7188    0.7120      1005
weighted avg     0.7778    0.8040    0.7901      1005

2023-10-14 01:27:31,682 ----------------------------------------------------------------------------------------------------