2023-10-14 08:28:49,609 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
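The printed module tree fully determines the model size. As a sanity check (plain Python, assuming the printed shapes are exhaustive), the layer dimensions above imply roughly 110.6M trainable parameters:

```python
# Parameter count implied by the module tree above (weights + biases).
H, FF, V, POS, TYPES, LAYERS, TAGS = 768, 3072, 32001, 512, 2, 12, 13

embeddings = V * H + POS * H + TYPES * H + 2 * H  # word/pos/type embeddings + LayerNorm
per_layer = (
    4 * (H * H + H)      # query, key, value, attention-output projections
    + 2 * H              # attention LayerNorm
    + (H * FF + FF)      # intermediate dense
    + (FF * H + H)       # output dense
    + 2 * H              # output LayerNorm
)
pooler = H * H + H
head = H * TAGS + TAGS   # final (linear) layer over the 13 tags

total = embeddings + LAYERS * per_layer + pooler + head
print(total)  # 110628109, i.e. ~110.6M parameters
```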
2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 Train: 5777 sentences
2023-10-14 08:28:49,610 (train_with_dev=False, train_with_test=False)
2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 Training Params:
2023-10-14 08:28:49,610 - learning_rate: "3e-05"
2023-10-14 08:28:49,610 - mini_batch_size: "8"
2023-10-14 08:28:49,610 - max_epochs: "10"
2023-10-14 08:28:49,610 - shuffle: "True"
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Plugins:
2023-10-14 08:28:49,611 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
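The LinearScheduler with `warmup_fraction: 0.1` ramps the learning rate up over the first 10% of steps and then decays it linearly to zero. With 723 iterations/epoch × 10 epochs = 7230 steps, warmup ends exactly at the end of epoch 1, which matches the lr column below (0.000003 at iter 72, peaking at 0.000030, then falling off). A minimal sketch of that shape (the exact flair implementation may differ in off-by-one details):

```python
def linear_schedule_lr(step, base_lr=3e-05, total_steps=7230, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 723 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# End of epoch 1 (step 723): peak 3e-05; end of epoch 2 (step 1446): ~2.7e-05,
# matching the values logged below.
print(round(linear_schedule_lr(723), 6), round(linear_schedule_lr(1446), 6))
```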
2023-10-14 08:28:49,611 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 08:28:49,611 - metric: "('micro avg', 'f1-score')"
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Computation:
2023-10-14 08:28:49,611 - compute on device: cuda:0
2023-10-14 08:28:49,611 - embedding storage: none
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:55,399 epoch 1 - iter 72/723 - loss 2.34500859 - time (sec): 5.79 - samples/sec: 2928.05 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:29:01,032 epoch 1 - iter 144/723 - loss 1.40146179 - time (sec): 11.42 - samples/sec: 2958.85 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:29:07,150 epoch 1 - iter 216/723 - loss 0.98472208 - time (sec): 17.54 - samples/sec: 2978.52 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:29:12,983 epoch 1 - iter 288/723 - loss 0.78948184 - time (sec): 23.37 - samples/sec: 2988.51 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:29:18,706 epoch 1 - iter 360/723 - loss 0.67269994 - time (sec): 29.09 - samples/sec: 2981.53 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:29:24,406 epoch 1 - iter 432/723 - loss 0.60136728 - time (sec): 34.79 - samples/sec: 2938.07 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:29:30,603 epoch 1 - iter 504/723 - loss 0.53425447 - time (sec): 40.99 - samples/sec: 2947.40 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:29:37,056 epoch 1 - iter 576/723 - loss 0.48656421 - time (sec): 47.44 - samples/sec: 2926.64 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:29:43,158 epoch 1 - iter 648/723 - loss 0.44962888 - time (sec): 53.55 - samples/sec: 2930.20 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:29:49,227 epoch 1 - iter 720/723 - loss 0.41686927 - time (sec): 59.62 - samples/sec: 2944.90 - lr: 0.000030 - momentum: 0.000000
2023-10-14 08:29:49,488 ----------------------------------------------------------------------------------------------------
2023-10-14 08:29:49,488 EPOCH 1 done: loss 0.4157 - lr: 0.000030
2023-10-14 08:29:52,663 DEV : loss 0.16736090183258057 - f1-score (micro avg) 0.5276
2023-10-14 08:29:52,698 saving best model
2023-10-14 08:29:53,052 ----------------------------------------------------------------------------------------------------
2023-10-14 08:29:58,787 epoch 2 - iter 72/723 - loss 0.15097610 - time (sec): 5.73 - samples/sec: 3028.16 - lr: 0.000030 - momentum: 0.000000
2023-10-14 08:30:04,682 epoch 2 - iter 144/723 - loss 0.13684182 - time (sec): 11.63 - samples/sec: 2960.60 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:30:10,736 epoch 2 - iter 216/723 - loss 0.13585466 - time (sec): 17.68 - samples/sec: 2945.67 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:30:16,269 epoch 2 - iter 288/723 - loss 0.12788984 - time (sec): 23.22 - samples/sec: 2985.08 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:30:22,637 epoch 2 - iter 360/723 - loss 0.12683567 - time (sec): 29.58 - samples/sec: 2963.23 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:30:28,367 epoch 2 - iter 432/723 - loss 0.12355013 - time (sec): 35.31 - samples/sec: 2965.98 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:30:34,488 epoch 2 - iter 504/723 - loss 0.12414425 - time (sec): 41.44 - samples/sec: 2962.13 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:30:39,830 epoch 2 - iter 576/723 - loss 0.12115131 - time (sec): 46.78 - samples/sec: 2963.61 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:30:46,149 epoch 2 - iter 648/723 - loss 0.11808836 - time (sec): 53.10 - samples/sec: 2972.27 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:30:52,188 epoch 2 - iter 720/723 - loss 0.11598087 - time (sec): 59.14 - samples/sec: 2970.40 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:30:52,397 ----------------------------------------------------------------------------------------------------
2023-10-14 08:30:52,397 EPOCH 2 done: loss 0.1160 - lr: 0.000027
2023-10-14 08:30:56,289 DEV : loss 0.09956270456314087 - f1-score (micro avg) 0.7529
2023-10-14 08:30:56,304 saving best model
2023-10-14 08:30:56,754 ----------------------------------------------------------------------------------------------------
2023-10-14 08:31:02,841 epoch 3 - iter 72/723 - loss 0.07480008 - time (sec): 6.09 - samples/sec: 2911.61 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:31:08,895 epoch 3 - iter 144/723 - loss 0.06907493 - time (sec): 12.14 - samples/sec: 2929.73 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:31:14,786 epoch 3 - iter 216/723 - loss 0.07126848 - time (sec): 18.03 - samples/sec: 2950.59 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:31:20,680 epoch 3 - iter 288/723 - loss 0.06958853 - time (sec): 23.92 - samples/sec: 2960.31 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:31:26,523 epoch 3 - iter 360/723 - loss 0.06902856 - time (sec): 29.77 - samples/sec: 2974.29 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:31:32,101 epoch 3 - iter 432/723 - loss 0.06866383 - time (sec): 35.35 - samples/sec: 3001.94 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:31:38,269 epoch 3 - iter 504/723 - loss 0.07120681 - time (sec): 41.51 - samples/sec: 2963.35 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:31:44,320 epoch 3 - iter 576/723 - loss 0.06999573 - time (sec): 47.56 - samples/sec: 2969.47 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:31:50,039 epoch 3 - iter 648/723 - loss 0.06994277 - time (sec): 53.28 - samples/sec: 2982.09 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:31:56,214 epoch 3 - iter 720/723 - loss 0.06941747 - time (sec): 59.46 - samples/sec: 2956.51 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:31:56,392 ----------------------------------------------------------------------------------------------------
2023-10-14 08:31:56,392 EPOCH 3 done: loss 0.0694 - lr: 0.000023
2023-10-14 08:31:59,885 DEV : loss 0.09209852665662766 - f1-score (micro avg) 0.7741
2023-10-14 08:31:59,909 saving best model
2023-10-14 08:32:00,444 ----------------------------------------------------------------------------------------------------
2023-10-14 08:32:07,148 epoch 4 - iter 72/723 - loss 0.04111019 - time (sec): 6.70 - samples/sec: 2691.45 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:32:13,780 epoch 4 - iter 144/723 - loss 0.03743022 - time (sec): 13.33 - samples/sec: 2745.82 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:32:19,590 epoch 4 - iter 216/723 - loss 0.04032334 - time (sec): 19.14 - samples/sec: 2778.83 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:32:25,877 epoch 4 - iter 288/723 - loss 0.04209697 - time (sec): 25.43 - samples/sec: 2803.23 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:32:31,717 epoch 4 - iter 360/723 - loss 0.04334243 - time (sec): 31.27 - samples/sec: 2834.19 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:32:37,415 epoch 4 - iter 432/723 - loss 0.04378885 - time (sec): 36.97 - samples/sec: 2846.93 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:32:43,029 epoch 4 - iter 504/723 - loss 0.04269892 - time (sec): 42.58 - samples/sec: 2864.00 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:32:49,370 epoch 4 - iter 576/723 - loss 0.04332948 - time (sec): 48.92 - samples/sec: 2871.55 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:32:55,288 epoch 4 - iter 648/723 - loss 0.04450354 - time (sec): 54.84 - samples/sec: 2867.88 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:33:01,292 epoch 4 - iter 720/723 - loss 0.04532438 - time (sec): 60.84 - samples/sec: 2888.17 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:33:01,510 ----------------------------------------------------------------------------------------------------
2023-10-14 08:33:01,510 EPOCH 4 done: loss 0.0459 - lr: 0.000020
2023-10-14 08:33:04,963 DEV : loss 0.09347887337207794 - f1-score (micro avg) 0.7897
2023-10-14 08:33:04,979 saving best model
2023-10-14 08:33:05,498 ----------------------------------------------------------------------------------------------------
2023-10-14 08:33:11,770 epoch 5 - iter 72/723 - loss 0.02961405 - time (sec): 6.27 - samples/sec: 2830.72 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:33:17,831 epoch 5 - iter 144/723 - loss 0.03413594 - time (sec): 12.33 - samples/sec: 2887.48 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:33:23,316 epoch 5 - iter 216/723 - loss 0.03172083 - time (sec): 17.82 - samples/sec: 2945.68 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:33:29,114 epoch 5 - iter 288/723 - loss 0.03207149 - time (sec): 23.62 - samples/sec: 2961.38 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:33:34,760 epoch 5 - iter 360/723 - loss 0.03130345 - time (sec): 29.26 - samples/sec: 2995.97 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:33:41,220 epoch 5 - iter 432/723 - loss 0.03111424 - time (sec): 35.72 - samples/sec: 2964.12 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:33:46,931 epoch 5 - iter 504/723 - loss 0.03173225 - time (sec): 41.43 - samples/sec: 2962.14 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:33:52,899 epoch 5 - iter 576/723 - loss 0.03178811 - time (sec): 47.40 - samples/sec: 2966.06 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:33:59,266 epoch 5 - iter 648/723 - loss 0.03309214 - time (sec): 53.77 - samples/sec: 2947.98 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:34:04,879 epoch 5 - iter 720/723 - loss 0.03278333 - time (sec): 59.38 - samples/sec: 2955.61 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:34:05,223 ----------------------------------------------------------------------------------------------------
2023-10-14 08:34:05,223 EPOCH 5 done: loss 0.0327 - lr: 0.000017
2023-10-14 08:34:09,605 DEV : loss 0.1093878448009491 - f1-score (micro avg) 0.8056
2023-10-14 08:34:09,627 saving best model
2023-10-14 08:34:10,174 ----------------------------------------------------------------------------------------------------
2023-10-14 08:34:16,234 epoch 6 - iter 72/723 - loss 0.02664406 - time (sec): 6.06 - samples/sec: 2967.43 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:34:22,424 epoch 6 - iter 144/723 - loss 0.02610538 - time (sec): 12.25 - samples/sec: 2951.86 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:34:28,422 epoch 6 - iter 216/723 - loss 0.02632130 - time (sec): 18.25 - samples/sec: 2958.67 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:34:34,905 epoch 6 - iter 288/723 - loss 0.02918746 - time (sec): 24.73 - samples/sec: 2906.90 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:34:40,499 epoch 6 - iter 360/723 - loss 0.02786739 - time (sec): 30.32 - samples/sec: 2923.14 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:34:46,405 epoch 6 - iter 432/723 - loss 0.02591985 - time (sec): 36.23 - samples/sec: 2914.88 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:34:52,512 epoch 6 - iter 504/723 - loss 0.02585531 - time (sec): 42.34 - samples/sec: 2913.63 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:34:58,755 epoch 6 - iter 576/723 - loss 0.02673939 - time (sec): 48.58 - samples/sec: 2916.01 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:35:04,619 epoch 6 - iter 648/723 - loss 0.02626021 - time (sec): 54.44 - samples/sec: 2908.49 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:35:10,343 epoch 6 - iter 720/723 - loss 0.02645765 - time (sec): 60.17 - samples/sec: 2919.78 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:35:10,568 ----------------------------------------------------------------------------------------------------
2023-10-14 08:35:10,568 EPOCH 6 done: loss 0.0264 - lr: 0.000013
2023-10-14 08:35:14,104 DEV : loss 0.1391119509935379 - f1-score (micro avg) 0.7855
2023-10-14 08:35:14,124 ----------------------------------------------------------------------------------------------------
2023-10-14 08:35:20,351 epoch 7 - iter 72/723 - loss 0.01350535 - time (sec): 6.23 - samples/sec: 2796.08 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:35:26,465 epoch 7 - iter 144/723 - loss 0.01712040 - time (sec): 12.34 - samples/sec: 2766.13 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:35:33,034 epoch 7 - iter 216/723 - loss 0.01751483 - time (sec): 18.91 - samples/sec: 2774.49 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:35:39,319 epoch 7 - iter 288/723 - loss 0.01623115 - time (sec): 25.19 - samples/sec: 2798.65 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:35:45,189 epoch 7 - iter 360/723 - loss 0.01745382 - time (sec): 31.06 - samples/sec: 2833.38 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:35:51,313 epoch 7 - iter 432/723 - loss 0.01897664 - time (sec): 37.19 - samples/sec: 2863.45 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:35:56,993 epoch 7 - iter 504/723 - loss 0.01893676 - time (sec): 42.87 - samples/sec: 2874.41 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:36:02,983 epoch 7 - iter 576/723 - loss 0.01873928 - time (sec): 48.86 - samples/sec: 2894.89 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:36:08,710 epoch 7 - iter 648/723 - loss 0.01874669 - time (sec): 54.59 - samples/sec: 2894.98 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:36:14,582 epoch 7 - iter 720/723 - loss 0.01849634 - time (sec): 60.46 - samples/sec: 2906.14 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:36:14,763 ----------------------------------------------------------------------------------------------------
2023-10-14 08:36:14,763 EPOCH 7 done: loss 0.0185 - lr: 0.000010
2023-10-14 08:36:18,275 DEV : loss 0.17075838148593903 - f1-score (micro avg) 0.8055
2023-10-14 08:36:18,295 ----------------------------------------------------------------------------------------------------
2023-10-14 08:36:24,144 epoch 8 - iter 72/723 - loss 0.01760211 - time (sec): 5.85 - samples/sec: 2982.67 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:36:30,981 epoch 8 - iter 144/723 - loss 0.01623072 - time (sec): 12.68 - samples/sec: 2781.67 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:36:36,973 epoch 8 - iter 216/723 - loss 0.01533673 - time (sec): 18.68 - samples/sec: 2833.48 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:36:42,799 epoch 8 - iter 288/723 - loss 0.01601558 - time (sec): 24.50 - samples/sec: 2864.13 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:36:49,045 epoch 8 - iter 360/723 - loss 0.01466925 - time (sec): 30.75 - samples/sec: 2897.17 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:36:54,836 epoch 8 - iter 432/723 - loss 0.01402939 - time (sec): 36.54 - samples/sec: 2898.47 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:37:00,513 epoch 8 - iter 504/723 - loss 0.01405464 - time (sec): 42.22 - samples/sec: 2926.51 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:37:06,081 epoch 8 - iter 576/723 - loss 0.01491234 - time (sec): 47.78 - samples/sec: 2932.12 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:37:12,626 epoch 8 - iter 648/723 - loss 0.01502329 - time (sec): 54.33 - samples/sec: 2916.30 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:37:18,676 epoch 8 - iter 720/723 - loss 0.01521053 - time (sec): 60.38 - samples/sec: 2912.50 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:37:18,861 ----------------------------------------------------------------------------------------------------
2023-10-14 08:37:18,861 EPOCH 8 done: loss 0.0152 - lr: 0.000007
2023-10-14 08:37:23,265 DEV : loss 0.17406047880649567 - f1-score (micro avg) 0.7968
2023-10-14 08:37:23,286 ----------------------------------------------------------------------------------------------------
2023-10-14 08:37:29,454 epoch 9 - iter 72/723 - loss 0.01059210 - time (sec): 6.17 - samples/sec: 2956.17 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:37:36,031 epoch 9 - iter 144/723 - loss 0.01193666 - time (sec): 12.74 - samples/sec: 2885.31 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:37:42,240 epoch 9 - iter 216/723 - loss 0.01045503 - time (sec): 18.95 - samples/sec: 2951.17 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:37:47,987 epoch 9 - iter 288/723 - loss 0.00976298 - time (sec): 24.70 - samples/sec: 2920.37 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:37:54,314 epoch 9 - iter 360/723 - loss 0.01051185 - time (sec): 31.03 - samples/sec: 2926.68 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:37:59,729 epoch 9 - iter 432/723 - loss 0.01007391 - time (sec): 36.44 - samples/sec: 2939.41 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:38:05,709 epoch 9 - iter 504/723 - loss 0.01053006 - time (sec): 42.42 - samples/sec: 2926.20 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:38:11,121 epoch 9 - iter 576/723 - loss 0.01019871 - time (sec): 47.83 - samples/sec: 2929.84 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:38:17,094 epoch 9 - iter 648/723 - loss 0.01035051 - time (sec): 53.81 - samples/sec: 2927.27 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:38:23,337 epoch 9 - iter 720/723 - loss 0.01068224 - time (sec): 60.05 - samples/sec: 2925.47 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:38:23,534 ----------------------------------------------------------------------------------------------------
2023-10-14 08:38:23,534 EPOCH 9 done: loss 0.0107 - lr: 0.000003
2023-10-14 08:38:27,028 DEV : loss 0.1967579573392868 - f1-score (micro avg) 0.7972
2023-10-14 08:38:27,047 ----------------------------------------------------------------------------------------------------
2023-10-14 08:38:33,194 epoch 10 - iter 72/723 - loss 0.00260707 - time (sec): 6.15 - samples/sec: 2990.09 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:38:38,655 epoch 10 - iter 144/723 - loss 0.00602904 - time (sec): 11.61 - samples/sec: 2988.23 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:38:44,750 epoch 10 - iter 216/723 - loss 0.01005193 - time (sec): 17.70 - samples/sec: 2974.13 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:38:51,428 epoch 10 - iter 288/723 - loss 0.00877443 - time (sec): 24.38 - samples/sec: 2908.63 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:38:56,998 epoch 10 - iter 360/723 - loss 0.00837203 - time (sec): 29.95 - samples/sec: 2933.31 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:39:03,575 epoch 10 - iter 432/723 - loss 0.00824030 - time (sec): 36.53 - samples/sec: 2931.89 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:39:09,222 epoch 10 - iter 504/723 - loss 0.00865644 - time (sec): 42.17 - samples/sec: 2942.99 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:39:14,978 epoch 10 - iter 576/723 - loss 0.00897306 - time (sec): 47.93 - samples/sec: 2944.84 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:39:20,723 epoch 10 - iter 648/723 - loss 0.00883926 - time (sec): 53.67 - samples/sec: 2940.14 - lr: 0.000000 - momentum: 0.000000
2023-10-14 08:39:26,903 epoch 10 - iter 720/723 - loss 0.00871898 - time (sec): 59.85 - samples/sec: 2938.23 - lr: 0.000000 - momentum: 0.000000
2023-10-14 08:39:27,070 ----------------------------------------------------------------------------------------------------
2023-10-14 08:39:27,070 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-14 08:39:30,587 DEV : loss 0.2016027718782425 - f1-score (micro avg) 0.7978
2023-10-14 08:39:30,972 ----------------------------------------------------------------------------------------------------
2023-10-14 08:39:30,973 Loading model from best epoch ...
2023-10-14 08:39:32,712 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
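The 13-tag dictionary is the BIOES scheme over the three entity types (LOC, PER, ORG): `S-` marks single-token entities, `B-`/`I-`/`E-` mark the beginning, inside, and end of multi-token ones, plus the outside tag `O`. A small illustrative decoder (a hypothetical helper, not part of flair) shows how such a sequence maps back to spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) entity spans."""
    spans, open_span = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            open_span = None
            continue
        prefix, label = tag.split("-")
        if prefix == "S":                                 # single-token entity
            spans.append((i, i, label))
            open_span = None
        elif prefix == "B":                               # entity begins
            open_span = (i, label)
        elif prefix == "E" and open_span and open_span[1] == label:
            spans.append((open_span[0], i, label))        # entity ends
            open_span = None
        # "I" continues an already-open span; nothing to record yet
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER"]))
# [(1, 1, 'LOC'), (2, 4, 'PER')]
```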
2023-10-14 08:39:35,860
Results:
- F-score (micro) 0.7954
- F-score (macro) 0.6863
- Accuracy 0.6765
By class:
              precision    recall  f1-score   support

         PER     0.7629    0.8610    0.8090       482
         LOC     0.8997    0.7838    0.8378       458
         ORG     0.4355    0.3913    0.4122        69

   micro avg     0.7970    0.7939    0.7954      1009
   macro avg     0.6994    0.6787    0.6863      1009
weighted avg     0.8026    0.7939    0.7949      1009
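The summary scores are internally consistent with the per-class table. Assuming the standard definitions, micro F1 is the harmonic mean of the micro-averaged precision and recall, macro F1 is the unweighted mean of the class F1 scores, and weighted F1 is the support-weighted mean:

```python
# Per-class (precision, recall, f1, support) from the table above.
per_class = {
    "PER": (0.7629, 0.8610, 0.8090, 482),
    "LOC": (0.8997, 0.7838, 0.8378, 458),
    "ORG": (0.4355, 0.3913, 0.4122, 69),
}

p, r = 0.7970, 0.7939                      # micro-averaged precision / recall
micro_f1 = 2 * p * r / (p + r)             # harmonic mean
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# 0.7954 0.6863 0.7949 -- matching the reported scores
```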
2023-10-14 08:39:35,861 ----------------------------------------------------------------------------------------------------