2023-10-13 21:26:19,181 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Train: 7936 sentences
2023-10-13 21:26:19,182 (train_with_dev=False, train_with_test=False)
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Training Params:
2023-10-13 21:26:19,182 - learning_rate: "3e-05"
2023-10-13 21:26:19,182 - mini_batch_size: "8"
2023-10-13 21:26:19,182 - max_epochs: "10"
2023-10-13 21:26:19,182 - shuffle: "True"
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Plugins:
2023-10-13 21:26:19,182 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
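Note on the learning-rate column: the parameters above (learning_rate 3e-05, warmup_fraction 0.1, 10 epochs of 992 iterations) are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below is a generic version of such a schedule, not Flair's plugin code; the function name `linear_schedule_lr` and the exact step indexing are assumptions made for illustration.

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Generic linear warmup then linear decay (illustrative, hypothetical helper)."""
    # Number of optimizer steps spent ramping up from 0 to peak_lr.
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    # After warmup, decay linearly so the rate reaches 0 at the last step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from this log: 992 iterations per epoch, 10 epochs.
TOTAL_STEPS = 992 * 10

for step in (99, 990, 1982, 9920):
    print(step, f"{linear_schedule_lr(step, TOTAL_STEPS):.6f}")
```

Under these assumptions the warmup ends at the close of epoch 1, which is where the logged `lr` peaks at 0.000030 before decreasing.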
2023-10-13 21:26:19,182 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 21:26:19,182 - metric: "('micro avg', 'f1-score')"
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Computation:
2023-10-13 21:26:19,182 - compute on device: cuda:0
2023-10-13 21:26:19,182 - embedding storage: none
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,183 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 21:26:19,183 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,183 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:25,065 epoch 1 - iter 99/992 - loss 2.28557417 - time (sec): 5.88 - samples/sec: 2733.74 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:26:31,062 epoch 1 - iter 198/992 - loss 1.39151134 - time (sec): 11.88 - samples/sec: 2741.70 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:26:36,969 epoch 1 - iter 297/992 - loss 1.02546157 - time (sec): 17.79 - samples/sec: 2764.99 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:26:42,777 epoch 1 - iter 396/992 - loss 0.82409225 - time (sec): 23.59 - samples/sec: 2767.68 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:26:48,467 epoch 1 - iter 495/992 - loss 0.69589370 - time (sec): 29.28 - samples/sec: 2780.04 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:26:54,198 epoch 1 - iter 594/992 - loss 0.60583632 - time (sec): 35.01 - samples/sec: 2786.20 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:27:00,081 epoch 1 - iter 693/992 - loss 0.53770132 - time (sec): 40.90 - samples/sec: 2804.16 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:27:06,207 epoch 1 - iter 792/992 - loss 0.48422796 - time (sec): 47.02 - samples/sec: 2810.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:27:12,012 epoch 1 - iter 891/992 - loss 0.44737479 - time (sec): 52.83 - samples/sec: 2803.31 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:27:18,204 epoch 1 - iter 990/992 - loss 0.41710425 - time (sec): 59.02 - samples/sec: 2774.99 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:27:18,322 ----------------------------------------------------------------------------------------------------
2023-10-13 21:27:18,322 EPOCH 1 done: loss 0.4167 - lr: 0.000030
2023-10-13 21:27:21,503 DEV : loss 0.09923805296421051 - f1-score (micro avg) 0.7086
2023-10-13 21:27:21,525 saving best model
2023-10-13 21:27:21,981 ----------------------------------------------------------------------------------------------------
2023-10-13 21:27:28,019 epoch 2 - iter 99/992 - loss 0.11613135 - time (sec): 6.04 - samples/sec: 2859.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:27:33,706 epoch 2 - iter 198/992 - loss 0.11269936 - time (sec): 11.72 - samples/sec: 2765.83 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:27:40,128 epoch 2 - iter 297/992 - loss 0.10852299 - time (sec): 18.14 - samples/sec: 2746.57 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:27:45,920 epoch 2 - iter 396/992 - loss 0.10649798 - time (sec): 23.94 - samples/sec: 2705.05 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:27:51,989 epoch 2 - iter 495/992 - loss 0.10652407 - time (sec): 30.01 - samples/sec: 2737.62 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:27:57,928 epoch 2 - iter 594/992 - loss 0.10483546 - time (sec): 35.94 - samples/sec: 2729.34 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:28:04,046 epoch 2 - iter 693/992 - loss 0.10404519 - time (sec): 42.06 - samples/sec: 2724.36 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:28:09,845 epoch 2 - iter 792/992 - loss 0.10351903 - time (sec): 47.86 - samples/sec: 2727.54 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:28:15,662 epoch 2 - iter 891/992 - loss 0.10258918 - time (sec): 53.68 - samples/sec: 2744.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:28:21,494 epoch 2 - iter 990/992 - loss 0.10133427 - time (sec): 59.51 - samples/sec: 2751.72 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:28:21,607 ----------------------------------------------------------------------------------------------------
2023-10-13 21:28:21,607 EPOCH 2 done: loss 0.1013 - lr: 0.000027
2023-10-13 21:28:25,515 DEV : loss 0.0804639682173729 - f1-score (micro avg) 0.7385
2023-10-13 21:28:25,535 saving best model
2023-10-13 21:28:26,011 ----------------------------------------------------------------------------------------------------
2023-10-13 21:28:31,880 epoch 3 - iter 99/992 - loss 0.06775603 - time (sec): 5.87 - samples/sec: 2791.67 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:28:37,828 epoch 3 - iter 198/992 - loss 0.06948443 - time (sec): 11.82 - samples/sec: 2698.32 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:28:43,969 epoch 3 - iter 297/992 - loss 0.06447600 - time (sec): 17.96 - samples/sec: 2753.14 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:28:50,058 epoch 3 - iter 396/992 - loss 0.06751946 - time (sec): 24.05 - samples/sec: 2771.64 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:28:55,993 epoch 3 - iter 495/992 - loss 0.06685992 - time (sec): 29.98 - samples/sec: 2763.41 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:29:01,692 epoch 3 - iter 594/992 - loss 0.06777144 - time (sec): 35.68 - samples/sec: 2775.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:29:07,475 epoch 3 - iter 693/992 - loss 0.06708595 - time (sec): 41.46 - samples/sec: 2773.37 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:29:13,244 epoch 3 - iter 792/992 - loss 0.06891828 - time (sec): 47.23 - samples/sec: 2784.55 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:29:19,038 epoch 3 - iter 891/992 - loss 0.06968673 - time (sec): 53.03 - samples/sec: 2786.41 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:29:24,682 epoch 3 - iter 990/992 - loss 0.06906410 - time (sec): 58.67 - samples/sec: 2787.68 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:29:24,800 ----------------------------------------------------------------------------------------------------
2023-10-13 21:29:24,801 EPOCH 3 done: loss 0.0690 - lr: 0.000023
2023-10-13 21:29:28,299 DEV : loss 0.1052832305431366 - f1-score (micro avg) 0.7509
2023-10-13 21:29:28,329 saving best model
2023-10-13 21:29:28,851 ----------------------------------------------------------------------------------------------------
2023-10-13 21:29:35,125 epoch 4 - iter 99/992 - loss 0.04809586 - time (sec): 6.27 - samples/sec: 2636.57 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:29:41,100 epoch 4 - iter 198/992 - loss 0.04947603 - time (sec): 12.25 - samples/sec: 2714.99 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:29:46,795 epoch 4 - iter 297/992 - loss 0.05118762 - time (sec): 17.94 - samples/sec: 2734.62 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:29:52,796 epoch 4 - iter 396/992 - loss 0.04965416 - time (sec): 23.94 - samples/sec: 2717.92 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:29:58,771 epoch 4 - iter 495/992 - loss 0.04848418 - time (sec): 29.92 - samples/sec: 2721.76 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:30:04,653 epoch 4 - iter 594/992 - loss 0.04858443 - time (sec): 35.80 - samples/sec: 2729.07 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:30:10,562 epoch 4 - iter 693/992 - loss 0.04967146 - time (sec): 41.71 - samples/sec: 2732.89 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:30:16,426 epoch 4 - iter 792/992 - loss 0.04938520 - time (sec): 47.57 - samples/sec: 2731.92 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:30:22,330 epoch 4 - iter 891/992 - loss 0.04876000 - time (sec): 53.48 - samples/sec: 2740.55 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:30:28,272 epoch 4 - iter 990/992 - loss 0.04971305 - time (sec): 59.42 - samples/sec: 2755.83 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:30:28,383 ----------------------------------------------------------------------------------------------------
2023-10-13 21:30:28,383 EPOCH 4 done: loss 0.0497 - lr: 0.000020
2023-10-13 21:30:31,897 DEV : loss 0.12055560946464539 - f1-score (micro avg) 0.7455
2023-10-13 21:30:31,919 ----------------------------------------------------------------------------------------------------
2023-10-13 21:30:37,673 epoch 5 - iter 99/992 - loss 0.04227405 - time (sec): 5.75 - samples/sec: 2891.99 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:30:43,513 epoch 5 - iter 198/992 - loss 0.03608106 - time (sec): 11.59 - samples/sec: 2842.42 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:30:49,376 epoch 5 - iter 297/992 - loss 0.03886863 - time (sec): 17.46 - samples/sec: 2822.33 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:30:56,245 epoch 5 - iter 396/992 - loss 0.03816738 - time (sec): 24.32 - samples/sec: 2740.89 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:31:02,116 epoch 5 - iter 495/992 - loss 0.03804566 - time (sec): 30.20 - samples/sec: 2755.43 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:31:07,852 epoch 5 - iter 594/992 - loss 0.03689071 - time (sec): 35.93 - samples/sec: 2761.47 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:31:13,786 epoch 5 - iter 693/992 - loss 0.03872646 - time (sec): 41.87 - samples/sec: 2749.53 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:31:19,687 epoch 5 - iter 792/992 - loss 0.03754210 - time (sec): 47.77 - samples/sec: 2754.69 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:31:25,930 epoch 5 - iter 891/992 - loss 0.03755859 - time (sec): 54.01 - samples/sec: 2741.27 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:31:31,815 epoch 5 - iter 990/992 - loss 0.03809129 - time (sec): 59.89 - samples/sec: 2731.36 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:31:31,954 ----------------------------------------------------------------------------------------------------
2023-10-13 21:31:31,954 EPOCH 5 done: loss 0.0381 - lr: 0.000017
2023-10-13 21:31:35,520 DEV : loss 0.13742151856422424 - f1-score (micro avg) 0.7587
2023-10-13 21:31:35,544 saving best model
2023-10-13 21:31:36,125 ----------------------------------------------------------------------------------------------------
2023-10-13 21:31:42,337 epoch 6 - iter 99/992 - loss 0.02712909 - time (sec): 6.21 - samples/sec: 2670.73 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:31:48,728 epoch 6 - iter 198/992 - loss 0.02785322 - time (sec): 12.60 - samples/sec: 2683.50 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:31:54,715 epoch 6 - iter 297/992 - loss 0.02848273 - time (sec): 18.59 - samples/sec: 2672.94 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:32:00,706 epoch 6 - iter 396/992 - loss 0.02785739 - time (sec): 24.58 - samples/sec: 2667.29 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:32:06,591 epoch 6 - iter 495/992 - loss 0.02751215 - time (sec): 30.46 - samples/sec: 2689.46 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:32:12,351 epoch 6 - iter 594/992 - loss 0.02798228 - time (sec): 36.22 - samples/sec: 2715.72 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:32:18,302 epoch 6 - iter 693/992 - loss 0.02829397 - time (sec): 42.18 - samples/sec: 2730.18 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:32:24,156 epoch 6 - iter 792/992 - loss 0.02864477 - time (sec): 48.03 - samples/sec: 2736.41 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:32:30,000 epoch 6 - iter 891/992 - loss 0.02875296 - time (sec): 53.87 - samples/sec: 2736.42 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:32:36,022 epoch 6 - iter 990/992 - loss 0.02942813 - time (sec): 59.90 - samples/sec: 2732.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:32:36,132 ----------------------------------------------------------------------------------------------------
2023-10-13 21:32:36,132 EPOCH 6 done: loss 0.0295 - lr: 0.000013
2023-10-13 21:32:39,605 DEV : loss 0.16862468421459198 - f1-score (micro avg) 0.7574
2023-10-13 21:32:39,626 ----------------------------------------------------------------------------------------------------
2023-10-13 21:32:45,397 epoch 7 - iter 99/992 - loss 0.02084181 - time (sec): 5.77 - samples/sec: 2663.18 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:32:51,692 epoch 7 - iter 198/992 - loss 0.02364712 - time (sec): 12.07 - samples/sec: 2669.23 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:32:57,470 epoch 7 - iter 297/992 - loss 0.02055440 - time (sec): 17.84 - samples/sec: 2677.99 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:33:03,230 epoch 7 - iter 396/992 - loss 0.02123931 - time (sec): 23.60 - samples/sec: 2709.80 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:33:09,233 epoch 7 - iter 495/992 - loss 0.02161191 - time (sec): 29.61 - samples/sec: 2740.96 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:33:15,904 epoch 7 - iter 594/992 - loss 0.02114814 - time (sec): 36.28 - samples/sec: 2714.40 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:33:21,533 epoch 7 - iter 693/992 - loss 0.02119562 - time (sec): 41.91 - samples/sec: 2712.74 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:33:27,488 epoch 7 - iter 792/992 - loss 0.02088100 - time (sec): 47.86 - samples/sec: 2730.44 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:33:33,206 epoch 7 - iter 891/992 - loss 0.02187657 - time (sec): 53.58 - samples/sec: 2744.28 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:33:38,997 epoch 7 - iter 990/992 - loss 0.02186821 - time (sec): 59.37 - samples/sec: 2755.54 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:33:39,128 ----------------------------------------------------------------------------------------------------
2023-10-13 21:33:39,129 EPOCH 7 done: loss 0.0218 - lr: 0.000010
2023-10-13 21:33:42,635 DEV : loss 0.18575911223888397 - f1-score (micro avg) 0.7592
2023-10-13 21:33:42,667 saving best model
2023-10-13 21:33:43,180 ----------------------------------------------------------------------------------------------------
2023-10-13 21:33:49,709 epoch 8 - iter 99/992 - loss 0.01242652 - time (sec): 6.52 - samples/sec: 2504.77 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:33:55,885 epoch 8 - iter 198/992 - loss 0.01492552 - time (sec): 12.70 - samples/sec: 2595.31 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:34:01,748 epoch 8 - iter 297/992 - loss 0.01415991 - time (sec): 18.56 - samples/sec: 2631.07 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:34:07,558 epoch 8 - iter 396/992 - loss 0.01548353 - time (sec): 24.37 - samples/sec: 2700.18 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:34:13,303 epoch 8 - iter 495/992 - loss 0.01575741 - time (sec): 30.12 - samples/sec: 2718.52 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:34:18,907 epoch 8 - iter 594/992 - loss 0.01594169 - time (sec): 35.72 - samples/sec: 2723.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:34:24,857 epoch 8 - iter 693/992 - loss 0.01654981 - time (sec): 41.67 - samples/sec: 2735.99 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:34:31,096 epoch 8 - iter 792/992 - loss 0.01587979 - time (sec): 47.91 - samples/sec: 2746.56 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:34:36,972 epoch 8 - iter 891/992 - loss 0.01575634 - time (sec): 53.79 - samples/sec: 2741.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:34:42,829 epoch 8 - iter 990/992 - loss 0.01614502 - time (sec): 59.64 - samples/sec: 2745.24 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:34:42,936 ----------------------------------------------------------------------------------------------------
2023-10-13 21:34:42,936 EPOCH 8 done: loss 0.0161 - lr: 0.000007
2023-10-13 21:34:46,404 DEV : loss 0.20498403906822205 - f1-score (micro avg) 0.7587
2023-10-13 21:34:46,426 ----------------------------------------------------------------------------------------------------
2023-10-13 21:34:52,358 epoch 9 - iter 99/992 - loss 0.01356369 - time (sec): 5.93 - samples/sec: 2823.61 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:34:58,266 epoch 9 - iter 198/992 - loss 0.01060758 - time (sec): 11.84 - samples/sec: 2769.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:35:03,917 epoch 9 - iter 297/992 - loss 0.01135131 - time (sec): 17.49 - samples/sec: 2781.11 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:35:09,966 epoch 9 - iter 396/992 - loss 0.01091601 - time (sec): 23.54 - samples/sec: 2794.35 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:35:15,776 epoch 9 - iter 495/992 - loss 0.01147457 - time (sec): 29.35 - samples/sec: 2811.81 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:35:21,554 epoch 9 - iter 594/992 - loss 0.01149657 - time (sec): 35.13 - samples/sec: 2792.52 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:35:27,248 epoch 9 - iter 693/992 - loss 0.01184055 - time (sec): 40.82 - samples/sec: 2796.65 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:35:33,220 epoch 9 - iter 792/992 - loss 0.01192689 - time (sec): 46.79 - samples/sec: 2799.57 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:35:39,316 epoch 9 - iter 891/992 - loss 0.01323888 - time (sec): 52.89 - samples/sec: 2795.24 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:35:45,243 epoch 9 - iter 990/992 - loss 0.01337615 - time (sec): 58.82 - samples/sec: 2781.18 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:35:45,385 ----------------------------------------------------------------------------------------------------
2023-10-13 21:35:45,385 EPOCH 9 done: loss 0.0134 - lr: 0.000003
2023-10-13 21:35:49,385 DEV : loss 0.20130668580532074 - f1-score (micro avg) 0.7596
2023-10-13 21:35:49,409 saving best model
2023-10-13 21:35:49,998 ----------------------------------------------------------------------------------------------------
2023-10-13 21:35:56,107 epoch 10 - iter 99/992 - loss 0.01231225 - time (sec): 6.11 - samples/sec: 2791.64 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:36:01,882 epoch 10 - iter 198/992 - loss 0.00908713 - time (sec): 11.88 - samples/sec: 2758.24 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:36:07,740 epoch 10 - iter 297/992 - loss 0.00865854 - time (sec): 17.74 - samples/sec: 2751.19 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:36:13,958 epoch 10 - iter 396/992 - loss 0.00923764 - time (sec): 23.96 - samples/sec: 2731.85 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:36:19,594 epoch 10 - iter 495/992 - loss 0.00934583 - time (sec): 29.59 - samples/sec: 2753.35 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:36:25,379 epoch 10 - iter 594/992 - loss 0.00873248 - time (sec): 35.38 - samples/sec: 2769.19 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:36:31,203 epoch 10 - iter 693/992 - loss 0.00930823 - time (sec): 41.20 - samples/sec: 2774.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:36:37,052 epoch 10 - iter 792/992 - loss 0.00939506 - time (sec): 47.05 - samples/sec: 2782.82 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:36:42,967 epoch 10 - iter 891/992 - loss 0.00942042 - time (sec): 52.97 - samples/sec: 2794.26 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:36:48,640 epoch 10 - iter 990/992 - loss 0.00923242 - time (sec): 58.64 - samples/sec: 2790.71 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:36:48,750 ----------------------------------------------------------------------------------------------------
2023-10-13 21:36:48,750 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-13 21:36:52,276 DEV : loss 0.21729230880737305 - f1-score (micro avg) 0.7547
2023-10-13 21:36:52,716 ----------------------------------------------------------------------------------------------------
2023-10-13 21:36:52,717 Loading model from best epoch ...
2023-10-13 21:36:54,254 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 21:36:57,774
Results:
- F-score (micro) 0.7742
- F-score (macro) 0.7002
- Accuracy 0.6494
By class:
              precision    recall  f1-score   support

         LOC     0.7980    0.8504    0.8234       655
         PER     0.7312    0.8296    0.7773       223
         ORG     0.5124    0.4882    0.5000       127

   micro avg     0.7500    0.8000    0.7742      1005
   macro avg     0.6805    0.7227    0.7002      1005
weighted avg     0.7471    0.8000    0.7723      1005
2023-10-13 21:36:57,774 ----------------------------------------------------------------------------------------------------
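Note on the reported scores: the averages in the final table can be re-derived from the per-class rows. The sketch below recomputes F1 as the harmonic mean of precision and recall, then the macro (unweighted) and support-weighted averages. The numbers are copied from the table in this log; the helper name `f1` and the dictionary layout are illustrative assumptions, and small differences are expected because the inputs are rounded to four decimals.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall, support) per class, copied from the results table.
per_class = {
    "LOC": (0.7980, 0.8504, 655),
    "PER": (0.7312, 0.8296, 223),
    "ORG": (0.5124, 0.4882, 127),
}

class_f1 = {name: f1(p, r) for name, (p, r, _) in per_class.items()}
total_support = sum(s for _, _, s in per_class.values())

# Macro average: plain mean over classes, so ORG counts as much as LOC.
macro_f1 = sum(class_f1.values()) / len(class_f1)
# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(class_f1[n] * s for n, (_, _, s) in per_class.items()) / total_support
# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.7500, 0.8000)

for name, score in class_f1.items():
    print(f"{name}: {score:.4f}")
print(f"micro: {micro_f1:.4f}  macro: {macro_f1:.4f}  weighted: {weighted_f1:.4f}")
```

The gap between the micro and macro scores comes mostly from ORG, which has the lowest F1 and the smallest support.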