2023-10-13 21:37:23,966 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 21:37:23,967 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 21:37:23,967 
----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 Train: 7936 sentences
2023-10-13 21:37:23,967 (train_with_dev=False, train_with_test=False)
2023-10-13 21:37:23,967 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 Training Params:
2023-10-13 21:37:23,967 - learning_rate: "5e-05"
2023-10-13 21:37:23,968 - mini_batch_size: "8"
2023-10-13 21:37:23,968 - max_epochs: "10"
2023-10-13 21:37:23,968 - shuffle: "True"
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Plugins:
2023-10-13 21:37:23,968 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 21:37:23,968 - metric: "('micro avg', 'f1-score')"
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Computation:
2023-10-13 21:37:23,968 - compute on device: cuda:0
2023-10-13 21:37:23,968 - embedding storage: none
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:29,890 epoch 1 - iter 99/992 - loss 1.91261312 - time (sec): 5.92 - samples/sec: 2715.74 - lr: 0.000005 - momentum: 
0.000000
2023-10-13 21:37:35,908 epoch 1 - iter 198/992 - loss 1.13445665 - time (sec): 11.94 - samples/sec: 2727.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:37:41,742 epoch 1 - iter 297/992 - loss 0.83788670 - time (sec): 17.77 - samples/sec: 2767.01 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:37:47,441 epoch 1 - iter 396/992 - loss 0.67694574 - time (sec): 23.47 - samples/sec: 2781.96 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:37:53,157 epoch 1 - iter 495/992 - loss 0.57510064 - time (sec): 29.19 - samples/sec: 2789.19 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:37:58,914 epoch 1 - iter 594/992 - loss 0.50275111 - time (sec): 34.95 - samples/sec: 2791.76 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:38:04,772 epoch 1 - iter 693/992 - loss 0.44880423 - time (sec): 40.80 - samples/sec: 2810.61 - lr: 0.000035 - momentum: 0.000000
2023-10-13 21:38:11,079 epoch 1 - iter 792/992 - loss 0.40684356 - time (sec): 47.11 - samples/sec: 2805.49 - lr: 0.000040 - momentum: 0.000000
2023-10-13 21:38:16,874 epoch 1 - iter 891/992 - loss 0.37820026 - time (sec): 52.90 - samples/sec: 2799.25 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:38:22,677 epoch 1 - iter 990/992 - loss 0.35457753 - time (sec): 58.71 - samples/sec: 2789.75 - lr: 0.000050 - momentum: 0.000000
2023-10-13 21:38:22,786 ----------------------------------------------------------------------------------------------------
2023-10-13 21:38:22,786 EPOCH 1 done: loss 0.3542 - lr: 0.000050
2023-10-13 21:38:26,277 DEV : loss 0.09475447982549667 - f1-score (micro avg) 0.6958
2023-10-13 21:38:26,298 saving best model
2023-10-13 21:38:26,721 ----------------------------------------------------------------------------------------------------
2023-10-13 21:38:32,814 epoch 2 - iter 99/992 - loss 0.12012769 - time (sec): 6.09 - samples/sec: 2833.35 - lr: 0.000049 - momentum: 0.000000
2023-10-13 21:38:38,526 epoch 2 - iter 198/992 - loss 0.11631000 - time (sec): 11.80 - samples/sec: 2746.93 - 
lr: 0.000049 - momentum: 0.000000
2023-10-13 21:38:44,938 epoch 2 - iter 297/992 - loss 0.11145763 - time (sec): 18.22 - samples/sec: 2735.81 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:38:50,662 epoch 2 - iter 396/992 - loss 0.10881005 - time (sec): 23.94 - samples/sec: 2704.74 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:38:56,694 epoch 2 - iter 495/992 - loss 0.10846778 - time (sec): 29.97 - samples/sec: 2740.68 - lr: 0.000047 - momentum: 0.000000
2023-10-13 21:39:02,441 epoch 2 - iter 594/992 - loss 0.10567866 - time (sec): 35.72 - samples/sec: 2746.65 - lr: 0.000047 - momentum: 0.000000
2023-10-13 21:39:08,610 epoch 2 - iter 693/992 - loss 0.10559153 - time (sec): 41.89 - samples/sec: 2735.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:39:14,672 epoch 2 - iter 792/992 - loss 0.10488835 - time (sec): 47.95 - samples/sec: 2722.53 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:39:20,740 epoch 2 - iter 891/992 - loss 0.10442316 - time (sec): 54.02 - samples/sec: 2727.03 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:39:26,803 epoch 2 - iter 990/992 - loss 0.10329774 - time (sec): 60.08 - samples/sec: 2725.62 - lr: 0.000044 - momentum: 0.000000
2023-10-13 21:39:26,916 ----------------------------------------------------------------------------------------------------
2023-10-13 21:39:26,916 EPOCH 2 done: loss 0.1033 - lr: 0.000044
2023-10-13 21:39:30,341 DEV : loss 0.08205121755599976 - f1-score (micro avg) 0.7129
2023-10-13 21:39:30,361 saving best model
2023-10-13 21:39:30,861 ----------------------------------------------------------------------------------------------------
2023-10-13 21:39:36,742 epoch 3 - iter 99/992 - loss 0.07082610 - time (sec): 5.87 - samples/sec: 2789.48 - lr: 0.000044 - momentum: 0.000000
2023-10-13 21:39:42,620 epoch 3 - iter 198/992 - loss 0.06839682 - time (sec): 11.75 - samples/sec: 2713.38 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:39:48,758 epoch 3 - iter 297/992 - loss 0.06531227 - time (sec): 17.89 
- samples/sec: 2763.77 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:39:54,864 epoch 3 - iter 396/992 - loss 0.06974993 - time (sec): 23.99 - samples/sec: 2777.66 - lr: 0.000042 - momentum: 0.000000
2023-10-13 21:40:00,891 epoch 3 - iter 495/992 - loss 0.07037521 - time (sec): 30.02 - samples/sec: 2759.70 - lr: 0.000042 - momentum: 0.000000
2023-10-13 21:40:06,636 epoch 3 - iter 594/992 - loss 0.07072352 - time (sec): 35.77 - samples/sec: 2768.83 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:40:12,505 epoch 3 - iter 693/992 - loss 0.07026764 - time (sec): 41.64 - samples/sec: 2761.90 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:40:18,247 epoch 3 - iter 792/992 - loss 0.07260697 - time (sec): 47.38 - samples/sec: 2775.98 - lr: 0.000040 - momentum: 0.000000
2023-10-13 21:40:24,041 epoch 3 - iter 891/992 - loss 0.07256251 - time (sec): 53.17 - samples/sec: 2778.82 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:40:29,699 epoch 3 - iter 990/992 - loss 0.07225091 - time (sec): 58.83 - samples/sec: 2780.17 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:40:29,815 ----------------------------------------------------------------------------------------------------
2023-10-13 21:40:29,815 EPOCH 3 done: loss 0.0722 - lr: 0.000039
2023-10-13 21:40:33,902 DEV : loss 0.10864556580781937 - f1-score (micro avg) 0.758
2023-10-13 21:40:33,939 saving best model
2023-10-13 21:40:34,466 ----------------------------------------------------------------------------------------------------
2023-10-13 21:40:40,380 epoch 4 - iter 99/992 - loss 0.05150709 - time (sec): 5.91 - samples/sec: 2796.73 - lr: 0.000038 - momentum: 0.000000
2023-10-13 21:40:46,332 epoch 4 - iter 198/992 - loss 0.04958120 - time (sec): 11.86 - samples/sec: 2802.33 - lr: 0.000038 - momentum: 0.000000
2023-10-13 21:40:51,999 epoch 4 - iter 297/992 - loss 0.05125257 - time (sec): 17.53 - samples/sec: 2798.55 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:40:57,854 epoch 4 - iter 396/992 - loss 
0.04922105 - time (sec): 23.39 - samples/sec: 2782.53 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:41:03,725 epoch 4 - iter 495/992 - loss 0.04834639 - time (sec): 29.26 - samples/sec: 2783.19 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:41:09,534 epoch 4 - iter 594/992 - loss 0.04924934 - time (sec): 35.07 - samples/sec: 2786.12 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:41:15,489 epoch 4 - iter 693/992 - loss 0.04948734 - time (sec): 41.02 - samples/sec: 2778.64 - lr: 0.000035 - momentum: 0.000000
2023-10-13 21:41:21,333 epoch 4 - iter 792/992 - loss 0.04908108 - time (sec): 46.87 - samples/sec: 2773.09 - lr: 0.000034 - momentum: 0.000000
2023-10-13 21:41:27,452 epoch 4 - iter 891/992 - loss 0.04886483 - time (sec): 52.98 - samples/sec: 2765.98 - lr: 0.000034 - momentum: 0.000000
2023-10-13 21:41:33,681 epoch 4 - iter 990/992 - loss 0.04998705 - time (sec): 59.21 - samples/sec: 2765.34 - lr: 0.000033 - momentum: 0.000000
2023-10-13 21:41:33,794 ----------------------------------------------------------------------------------------------------
2023-10-13 21:41:33,794 EPOCH 4 done: loss 0.0499 - lr: 0.000033
2023-10-13 21:41:37,201 DEV : loss 0.11597025394439697 - f1-score (micro avg) 0.7675
2023-10-13 21:41:37,222 saving best model
2023-10-13 21:41:37,774 ----------------------------------------------------------------------------------------------------
2023-10-13 21:41:43,640 epoch 5 - iter 99/992 - loss 0.04977178 - time (sec): 5.86 - samples/sec: 2838.67 - lr: 0.000033 - momentum: 0.000000
2023-10-13 21:41:49,509 epoch 5 - iter 198/992 - loss 0.04238345 - time (sec): 11.73 - samples/sec: 2809.15 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:41:55,313 epoch 5 - iter 297/992 - loss 0.04044960 - time (sec): 17.53 - samples/sec: 2809.68 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:42:01,758 epoch 5 - iter 396/992 - loss 0.03908575 - time (sec): 23.98 - samples/sec: 2780.41 - lr: 0.000031 - momentum: 0.000000
2023-10-13 21:42:07,657 epoch 
5 - iter 495/992 - loss 0.03930276 - time (sec): 29.88 - samples/sec: 2784.79 - lr: 0.000031 - momentum: 0.000000
2023-10-13 21:42:13,458 epoch 5 - iter 594/992 - loss 0.03815672 - time (sec): 35.68 - samples/sec: 2781.01 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:42:19,292 epoch 5 - iter 693/992 - loss 0.03945281 - time (sec): 41.51 - samples/sec: 2772.88 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:42:24,988 epoch 5 - iter 792/992 - loss 0.03857818 - time (sec): 47.21 - samples/sec: 2787.22 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:42:31,007 epoch 5 - iter 891/992 - loss 0.03988100 - time (sec): 53.23 - samples/sec: 2781.52 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:42:36,766 epoch 5 - iter 990/992 - loss 0.04081192 - time (sec): 58.99 - samples/sec: 2773.33 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:42:36,903 ----------------------------------------------------------------------------------------------------
2023-10-13 21:42:36,903 EPOCH 5 done: loss 0.0408 - lr: 0.000028
2023-10-13 21:42:41,542 DEV : loss 0.13763324916362762 - f1-score (micro avg) 0.7589
2023-10-13 21:42:41,580 ----------------------------------------------------------------------------------------------------
2023-10-13 21:42:47,750 epoch 6 - iter 99/992 - loss 0.02422172 - time (sec): 6.17 - samples/sec: 2688.98 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:42:53,863 epoch 6 - iter 198/992 - loss 0.02863550 - time (sec): 12.28 - samples/sec: 2753.55 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:42:59,675 epoch 6 - iter 297/992 - loss 0.02988685 - time (sec): 18.09 - samples/sec: 2746.20 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:43:05,663 epoch 6 - iter 396/992 - loss 0.02833903 - time (sec): 24.08 - samples/sec: 2722.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:43:11,508 epoch 6 - iter 495/992 - loss 0.02786454 - time (sec): 29.93 - samples/sec: 2737.83 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:43:17,288 epoch 6 - iter 594/992 
- loss 0.02866839 - time (sec): 35.71 - samples/sec: 2755.15 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:43:23,408 epoch 6 - iter 693/992 - loss 0.02806587 - time (sec): 41.83 - samples/sec: 2753.01 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:43:29,273 epoch 6 - iter 792/992 - loss 0.02884132 - time (sec): 47.69 - samples/sec: 2755.88 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:43:35,118 epoch 6 - iter 891/992 - loss 0.02883868 - time (sec): 53.54 - samples/sec: 2753.63 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:43:41,138 epoch 6 - iter 990/992 - loss 0.02863755 - time (sec): 59.56 - samples/sec: 2748.21 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:43:41,250 ----------------------------------------------------------------------------------------------------
2023-10-13 21:43:41,250 EPOCH 6 done: loss 0.0288 - lr: 0.000022
2023-10-13 21:43:44,796 DEV : loss 0.17916934192180634 - f1-score (micro avg) 0.7563
2023-10-13 21:43:44,821 ----------------------------------------------------------------------------------------------------
2023-10-13 21:43:50,755 epoch 7 - iter 99/992 - loss 0.01996214 - time (sec): 5.93 - samples/sec: 2590.21 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:43:57,458 epoch 7 - iter 198/992 - loss 0.02277291 - time (sec): 12.63 - samples/sec: 2548.91 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:44:03,311 epoch 7 - iter 297/992 - loss 0.02180867 - time (sec): 18.49 - samples/sec: 2584.45 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:44:09,225 epoch 7 - iter 396/992 - loss 0.02147002 - time (sec): 24.40 - samples/sec: 2621.14 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:44:15,345 epoch 7 - iter 495/992 - loss 0.02147281 - time (sec): 30.52 - samples/sec: 2658.70 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:44:21,676 epoch 7 - iter 594/992 - loss 0.02278340 - time (sec): 36.85 - samples/sec: 2672.02 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:44:27,314 epoch 7 - iter 693/992 - loss 0.02302071 
- time (sec): 42.49 - samples/sec: 2675.36 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:44:33,289 epoch 7 - iter 792/992 - loss 0.02238297 - time (sec): 48.47 - samples/sec: 2696.36 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:44:38,932 epoch 7 - iter 891/992 - loss 0.02194504 - time (sec): 54.11 - samples/sec: 2717.35 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:44:44,806 epoch 7 - iter 990/992 - loss 0.02178032 - time (sec): 59.98 - samples/sec: 2727.36 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:44:44,942 ----------------------------------------------------------------------------------------------------
2023-10-13 21:44:44,942 EPOCH 7 done: loss 0.0217 - lr: 0.000017
2023-10-13 21:44:48,386 DEV : loss 0.19963695108890533 - f1-score (micro avg) 0.7511
2023-10-13 21:44:48,411 ----------------------------------------------------------------------------------------------------
2023-10-13 21:44:54,512 epoch 8 - iter 99/992 - loss 0.00802566 - time (sec): 6.10 - samples/sec: 2678.43 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:45:00,449 epoch 8 - iter 198/992 - loss 0.01380324 - time (sec): 12.04 - samples/sec: 2738.01 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:45:06,122 epoch 8 - iter 297/992 - loss 0.01227669 - time (sec): 17.71 - samples/sec: 2757.67 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:45:12,518 epoch 8 - iter 396/992 - loss 0.01400552 - time (sec): 24.11 - samples/sec: 2729.93 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:45:18,333 epoch 8 - iter 495/992 - loss 0.01475903 - time (sec): 29.92 - samples/sec: 2736.32 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:45:23,984 epoch 8 - iter 594/992 - loss 0.01530820 - time (sec): 35.57 - samples/sec: 2735.01 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:45:30,069 epoch 8 - iter 693/992 - loss 0.01589983 - time (sec): 41.66 - samples/sec: 2736.91 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:45:36,359 epoch 8 - iter 792/992 - loss 0.01603725 - time (sec): 
47.95 - samples/sec: 2744.42 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:45:42,042 epoch 8 - iter 891/992 - loss 0.01533557 - time (sec): 53.63 - samples/sec: 2749.62 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:45:47,922 epoch 8 - iter 990/992 - loss 0.01568211 - time (sec): 59.51 - samples/sec: 2751.39 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:45:48,031 ----------------------------------------------------------------------------------------------------
2023-10-13 21:45:48,031 EPOCH 8 done: loss 0.0157 - lr: 0.000011
2023-10-13 21:45:51,538 DEV : loss 0.2106373906135559 - f1-score (micro avg) 0.7511
2023-10-13 21:45:51,560 ----------------------------------------------------------------------------------------------------
2023-10-13 21:45:57,529 epoch 9 - iter 99/992 - loss 0.00887012 - time (sec): 5.97 - samples/sec: 2806.18 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:46:03,489 epoch 9 - iter 198/992 - loss 0.00899000 - time (sec): 11.93 - samples/sec: 2748.69 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:46:09,616 epoch 9 - iter 297/992 - loss 0.00742884 - time (sec): 18.05 - samples/sec: 2694.16 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:46:15,833 epoch 9 - iter 396/992 - loss 0.00794530 - time (sec): 24.27 - samples/sec: 2710.07 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:46:21,629 epoch 9 - iter 495/992 - loss 0.00848285 - time (sec): 30.07 - samples/sec: 2744.72 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:46:27,363 epoch 9 - iter 594/992 - loss 0.00917051 - time (sec): 35.80 - samples/sec: 2739.94 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:46:32,985 epoch 9 - iter 693/992 - loss 0.00939071 - time (sec): 41.42 - samples/sec: 2755.95 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:46:38,976 epoch 9 - iter 792/992 - loss 0.01010125 - time (sec): 47.41 - samples/sec: 2762.85 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:46:45,090 epoch 9 - iter 891/992 - loss 0.01075447 - time (sec): 53.53 - 
samples/sec: 2761.88 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:46:50,883 epoch 9 - iter 990/992 - loss 0.01068750 - time (sec): 59.32 - samples/sec: 2757.49 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:46:51,018 ----------------------------------------------------------------------------------------------------
2023-10-13 21:46:51,018 EPOCH 9 done: loss 0.0107 - lr: 0.000006
2023-10-13 21:46:54,559 DEV : loss 0.22137963771820068 - f1-score (micro avg) 0.7493
2023-10-13 21:46:54,581 ----------------------------------------------------------------------------------------------------
2023-10-13 21:47:00,669 epoch 10 - iter 99/992 - loss 0.00885115 - time (sec): 6.09 - samples/sec: 2800.45 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:47:06,567 epoch 10 - iter 198/992 - loss 0.00701261 - time (sec): 11.98 - samples/sec: 2734.37 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:47:12,490 epoch 10 - iter 297/992 - loss 0.00709994 - time (sec): 17.91 - samples/sec: 2725.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:47:19,212 epoch 10 - iter 396/992 - loss 0.00713180 - time (sec): 24.63 - samples/sec: 2657.26 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:47:24,800 epoch 10 - iter 495/992 - loss 0.00712523 - time (sec): 30.22 - samples/sec: 2696.43 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:47:30,657 epoch 10 - iter 594/992 - loss 0.00752130 - time (sec): 36.07 - samples/sec: 2715.70 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:47:36,606 epoch 10 - iter 693/992 - loss 0.00756143 - time (sec): 42.02 - samples/sec: 2720.45 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:47:42,502 epoch 10 - iter 792/992 - loss 0.00745144 - time (sec): 47.92 - samples/sec: 2732.37 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:47:48,499 epoch 10 - iter 891/992 - loss 0.00734368 - time (sec): 53.92 - samples/sec: 2744.99 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:47:54,258 epoch 10 - iter 990/992 - loss 0.00723729 - time (sec): 59.68 - 
samples/sec: 2742.25 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:47:54,381 ----------------------------------------------------------------------------------------------------
2023-10-13 21:47:54,381 EPOCH 10 done: loss 0.0072 - lr: 0.000000
2023-10-13 21:47:57,909 DEV : loss 0.22665657103061676 - f1-score (micro avg) 0.7513
2023-10-13 21:47:58,383 ----------------------------------------------------------------------------------------------------
2023-10-13 21:47:58,384 Loading model from best epoch ...
2023-10-13 21:47:59,826 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 21:48:03,320 
Results:
- F-score (micro) 0.7736
- F-score (macro) 0.6613
- Accuracy 0.6522

By class:
              precision    recall  f1-score   support

         LOC     0.7826    0.8794    0.8282       655
         PER     0.8556    0.6906    0.7643       223
         ORG     0.5968    0.2913    0.3915       127

   micro avg     0.7843    0.7632    0.7736      1005
   macro avg     0.7450    0.6204    0.6613      1005
weighted avg     0.7753    0.7632    0.7588      1005

2023-10-13 21:48:03,320 ----------------------------------------------------------------------------------------------------