2023-10-14 10:08:03,990 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,991 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 10:08:03,991 ----------------------------------------------------------------------------------------------------
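As a sanity check on the architecture printed above, the parameter counts implied by the layer shapes can be tallied by hand. A minimal pure-Python sketch; the numbers come only from the dimensions in the printout (this is arithmetic, not a Flair or PyTorch API call):

```python
# Parameter count implied by the architecture printout above.

def linear(n_in, n_out):      # weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(dim):          # scale + shift
    return 2 * dim

# BertEmbeddings: word, position, token-type embeddings + LayerNorm
embeddings = 32001 * 768 + 512 * 768 + 2 * 768 + layer_norm(768)

# One BertLayer: Q/K/V, attention output, intermediate, output
bert_layer = (
    3 * linear(768, 768)                    # query, key, value
    + linear(768, 768) + layer_norm(768)    # BertSelfOutput
    + linear(768, 3072)                     # BertIntermediate
    + linear(3072, 768) + layer_norm(768)   # BertOutput
)

encoder = 12 * bert_layer     # (0-11): 12 x BertLayer
pooler = linear(768, 768)
tagger_head = linear(768, 13) # the 13-tag classification layer

bert_total = embeddings + encoder + pooler
print(bert_total)             # 110618112 — the usual ~110M of BERT-base
print(tagger_head)            # 9997
```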
2023-10-14 10:08:03,992 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Train: 5777 sentences
2023-10-14 10:08:03,992 (train_with_dev=False, train_with_test=False)
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Training Params:
2023-10-14 10:08:03,992 - learning_rate: "3e-05"
2023-10-14 10:08:03,992 - mini_batch_size: "8"
2023-10-14 10:08:03,992 - max_epochs: "10"
2023-10-14 10:08:03,992 - shuffle: "True"
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Plugins:
2023-10-14 10:08:03,992 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
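The LinearScheduler plugin with warmup_fraction 0.1 corresponds to a linear warmup over the first 10% of the 7230 total steps (723 iterations x 10 epochs), then linear decay to zero. A minimal sketch, assuming the standard one-cycle linear schedule; Flair's exact per-step bookkeeping may differ slightly, but the values match the lr column logged below:

```python
# Linear warmup + linear decay, reconstructed from the logged lr values.
PEAK_LR = 3e-05
TOTAL_STEPS = 723 * 10                   # 723 iterations/epoch, 10 epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # warmup_fraction: 0.1 -> 723 steps

def lr_at(step):
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the log: lr ramps to 0.000030 by the end of epoch 1,
# then decays linearly toward 0 by the end of epoch 10.
print(round(lr_at(72), 6))    # 3e-06, as logged at epoch 1 iter 72/723
print(round(lr_at(1443), 6))  # 2.7e-05, as logged at epoch 2 iter 720/723
print(lr_at(TOTAL_STEPS))     # 0.0 at the final step
```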
2023-10-14 10:08:03,992 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 10:08:03,992 - metric: "('micro avg', 'f1-score')"
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Computation:
2023-10-14 10:08:03,992 - compute on device: cuda:0
2023-10-14 10:08:03,992 - embedding storage: none
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:03,992 ----------------------------------------------------------------------------------------------------
2023-10-14 10:08:09,798 epoch 1 - iter 72/723 - loss 2.06007185 - time (sec): 5.80 - samples/sec: 2987.50 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:08:15,437 epoch 1 - iter 144/723 - loss 1.20291780 - time (sec): 11.44 - samples/sec: 3042.63 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:08:20,979 epoch 1 - iter 216/723 - loss 0.87883238 - time (sec): 16.99 - samples/sec: 3054.73 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:08:26,795 epoch 1 - iter 288/723 - loss 0.71332727 - time (sec): 22.80 - samples/sec: 3045.17 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:08:33,145 epoch 1 - iter 360/723 - loss 0.59359558 - time (sec): 29.15 - samples/sec: 3051.16 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:08:39,305 epoch 1 - iter 432/723 - loss 0.52859171 - time (sec): 35.31 - samples/sec: 2996.88 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:08:44,793 epoch 1 - iter 504/723 - loss 0.47878282 - time (sec): 40.80 - samples/sec: 3002.83 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:08:50,941 epoch 1 - iter 576/723 - loss 0.43557998 - time (sec): 46.95 - samples/sec: 2992.78 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:08:56,809 epoch 1 - iter 648/723 - loss 0.40192494 - time (sec): 52.82 - samples/sec: 3001.84 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:09:02,756 epoch 1 - iter 720/723 - loss 0.37551955 - time (sec): 58.76 - samples/sec: 2993.44 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:09:02,920 ----------------------------------------------------------------------------------------------------
2023-10-14 10:09:02,920 EPOCH 1 done: loss 0.3754 - lr: 0.000030
2023-10-14 10:09:06,492 DEV : loss 0.13150866329669952 - f1-score (micro avg) 0.6594
2023-10-14 10:09:06,508 saving best model
2023-10-14 10:09:07,015 ----------------------------------------------------------------------------------------------------
2023-10-14 10:09:13,003 epoch 2 - iter 72/723 - loss 0.13450373 - time (sec): 5.99 - samples/sec: 2985.36 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:09:18,840 epoch 2 - iter 144/723 - loss 0.12552915 - time (sec): 11.82 - samples/sec: 2966.63 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:09:24,322 epoch 2 - iter 216/723 - loss 0.11969064 - time (sec): 17.30 - samples/sec: 3022.02 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:09:30,564 epoch 2 - iter 288/723 - loss 0.11742229 - time (sec): 23.55 - samples/sec: 3008.37 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:09:36,925 epoch 2 - iter 360/723 - loss 0.11264569 - time (sec): 29.91 - samples/sec: 2988.19 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:09:42,706 epoch 2 - iter 432/723 - loss 0.11031740 - time (sec): 35.69 - samples/sec: 2988.22 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:09:48,169 epoch 2 - iter 504/723 - loss 0.10834730 - time (sec): 41.15 - samples/sec: 2994.85 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:09:53,842 epoch 2 - iter 576/723 - loss 0.10492283 - time (sec): 46.83 - samples/sec: 3012.33 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:09:59,792 epoch 2 - iter 648/723 - loss 0.10618747 - time (sec): 52.78 - samples/sec: 3005.38 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:10:05,483 epoch 2 - iter 720/723 - loss 0.10530455 - time (sec): 58.47 - samples/sec: 3006.11 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:10:05,647 ----------------------------------------------------------------------------------------------------
2023-10-14 10:10:05,648 EPOCH 2 done: loss 0.1052 - lr: 0.000027
2023-10-14 10:10:09,197 DEV : loss 0.09088422358036041 - f1-score (micro avg) 0.7818
2023-10-14 10:10:09,213 saving best model
2023-10-14 10:10:09,756 ----------------------------------------------------------------------------------------------------
2023-10-14 10:10:15,427 epoch 3 - iter 72/723 - loss 0.07884216 - time (sec): 5.67 - samples/sec: 3092.57 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:10:21,258 epoch 3 - iter 144/723 - loss 0.07493376 - time (sec): 11.50 - samples/sec: 3032.42 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:10:27,358 epoch 3 - iter 216/723 - loss 0.07038820 - time (sec): 17.60 - samples/sec: 3006.94 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:10:33,575 epoch 3 - iter 288/723 - loss 0.07304398 - time (sec): 23.82 - samples/sec: 2969.72 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:10:39,604 epoch 3 - iter 360/723 - loss 0.06869986 - time (sec): 29.85 - samples/sec: 2951.49 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:10:45,313 epoch 3 - iter 432/723 - loss 0.06666406 - time (sec): 35.56 - samples/sec: 2970.21 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:10:51,783 epoch 3 - iter 504/723 - loss 0.06574320 - time (sec): 42.03 - samples/sec: 2951.83 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:10:57,354 epoch 3 - iter 576/723 - loss 0.06487572 - time (sec): 47.60 - samples/sec: 2955.14 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:11:03,334 epoch 3 - iter 648/723 - loss 0.06527005 - time (sec): 53.58 - samples/sec: 2958.50 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:11:09,185 epoch 3 - iter 720/723 - loss 0.06498541 - time (sec): 59.43 - samples/sec: 2956.97 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:11:09,364 ----------------------------------------------------------------------------------------------------
2023-10-14 10:11:09,364 EPOCH 3 done: loss 0.0650 - lr: 0.000023
2023-10-14 10:11:13,822 DEV : loss 0.08999822288751602 - f1-score (micro avg) 0.8105
2023-10-14 10:11:13,844 saving best model
2023-10-14 10:11:14,331 ----------------------------------------------------------------------------------------------------
2023-10-14 10:11:20,163 epoch 4 - iter 72/723 - loss 0.04124378 - time (sec): 5.83 - samples/sec: 2961.66 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:11:26,426 epoch 4 - iter 144/723 - loss 0.04255973 - time (sec): 12.09 - samples/sec: 2874.11 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:11:32,124 epoch 4 - iter 216/723 - loss 0.04002331 - time (sec): 17.79 - samples/sec: 2872.48 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:11:38,075 epoch 4 - iter 288/723 - loss 0.04041452 - time (sec): 23.74 - samples/sec: 2921.20 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:11:43,923 epoch 4 - iter 360/723 - loss 0.04054372 - time (sec): 29.59 - samples/sec: 2940.59 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:11:50,088 epoch 4 - iter 432/723 - loss 0.04272394 - time (sec): 35.75 - samples/sec: 2951.18 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:11:56,210 epoch 4 - iter 504/723 - loss 0.04250104 - time (sec): 41.87 - samples/sec: 2961.09 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:12:02,173 epoch 4 - iter 576/723 - loss 0.04238222 - time (sec): 47.84 - samples/sec: 2936.00 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:12:07,867 epoch 4 - iter 648/723 - loss 0.04240052 - time (sec): 53.53 - samples/sec: 2942.86 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:12:13,999 epoch 4 - iter 720/723 - loss 0.04290273 - time (sec): 59.66 - samples/sec: 2947.30 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:12:14,161 ----------------------------------------------------------------------------------------------------
2023-10-14 10:12:14,162 EPOCH 4 done: loss 0.0428 - lr: 0.000020
2023-10-14 10:12:17,822 DEV : loss 0.11989317834377289 - f1-score (micro avg) 0.7626
2023-10-14 10:12:17,844 ----------------------------------------------------------------------------------------------------
2023-10-14 10:12:25,215 epoch 5 - iter 72/723 - loss 0.02915324 - time (sec): 7.37 - samples/sec: 2537.77 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:12:30,969 epoch 5 - iter 144/723 - loss 0.03042922 - time (sec): 13.12 - samples/sec: 2734.26 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:12:37,275 epoch 5 - iter 216/723 - loss 0.03176786 - time (sec): 19.43 - samples/sec: 2779.57 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:12:43,306 epoch 5 - iter 288/723 - loss 0.03155849 - time (sec): 25.46 - samples/sec: 2814.76 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:12:49,220 epoch 5 - iter 360/723 - loss 0.03120094 - time (sec): 31.37 - samples/sec: 2841.19 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:12:55,192 epoch 5 - iter 432/723 - loss 0.03150309 - time (sec): 37.35 - samples/sec: 2856.40 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:13:01,212 epoch 5 - iter 504/723 - loss 0.03075019 - time (sec): 43.37 - samples/sec: 2845.59 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:13:07,103 epoch 5 - iter 576/723 - loss 0.03074046 - time (sec): 49.26 - samples/sec: 2859.84 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:13:13,031 epoch 5 - iter 648/723 - loss 0.02971689 - time (sec): 55.19 - samples/sec: 2871.48 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:13:19,192 epoch 5 - iter 720/723 - loss 0.03186083 - time (sec): 61.35 - samples/sec: 2863.54 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:13:19,371 ----------------------------------------------------------------------------------------------------
2023-10-14 10:13:19,371 EPOCH 5 done: loss 0.0318 - lr: 0.000017
2023-10-14 10:13:22,892 DEV : loss 0.11947084218263626 - f1-score (micro avg) 0.807
2023-10-14 10:13:22,908 ----------------------------------------------------------------------------------------------------
2023-10-14 10:13:28,744 epoch 6 - iter 72/723 - loss 0.02063196 - time (sec): 5.84 - samples/sec: 2998.70 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:13:35,309 epoch 6 - iter 144/723 - loss 0.02278467 - time (sec): 12.40 - samples/sec: 2929.50 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:13:41,170 epoch 6 - iter 216/723 - loss 0.02607712 - time (sec): 18.26 - samples/sec: 2935.86 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:13:47,349 epoch 6 - iter 288/723 - loss 0.02571781 - time (sec): 24.44 - samples/sec: 2907.07 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:13:53,357 epoch 6 - iter 360/723 - loss 0.02737462 - time (sec): 30.45 - samples/sec: 2902.17 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:13:59,479 epoch 6 - iter 432/723 - loss 0.02692809 - time (sec): 36.57 - samples/sec: 2921.98 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:14:05,524 epoch 6 - iter 504/723 - loss 0.02745476 - time (sec): 42.61 - samples/sec: 2913.83 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:14:10,964 epoch 6 - iter 576/723 - loss 0.02632377 - time (sec): 48.06 - samples/sec: 2929.49 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:14:16,629 epoch 6 - iter 648/723 - loss 0.02661387 - time (sec): 53.72 - samples/sec: 2927.13 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:14:22,775 epoch 6 - iter 720/723 - loss 0.02605300 - time (sec): 59.87 - samples/sec: 2932.12 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:14:23,039 ----------------------------------------------------------------------------------------------------
2023-10-14 10:14:23,040 EPOCH 6 done: loss 0.0260 - lr: 0.000013
2023-10-14 10:14:26,933 DEV : loss 0.13547733426094055 - f1-score (micro avg) 0.7987
2023-10-14 10:14:26,948 ----------------------------------------------------------------------------------------------------
2023-10-14 10:14:32,668 epoch 7 - iter 72/723 - loss 0.01058272 - time (sec): 5.72 - samples/sec: 3034.42 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:14:39,035 epoch 7 - iter 144/723 - loss 0.01503699 - time (sec): 12.09 - samples/sec: 2903.48 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:14:44,634 epoch 7 - iter 216/723 - loss 0.01870794 - time (sec): 17.68 - samples/sec: 2957.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:14:50,672 epoch 7 - iter 288/723 - loss 0.01832497 - time (sec): 23.72 - samples/sec: 2961.74 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:14:56,448 epoch 7 - iter 360/723 - loss 0.01843164 - time (sec): 29.50 - samples/sec: 2969.17 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:15:02,595 epoch 7 - iter 432/723 - loss 0.01926180 - time (sec): 35.65 - samples/sec: 2963.87 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:15:08,583 epoch 7 - iter 504/723 - loss 0.01935993 - time (sec): 41.63 - samples/sec: 2953.78 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:15:14,430 epoch 7 - iter 576/723 - loss 0.01890065 - time (sec): 47.48 - samples/sec: 2946.41 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:15:20,969 epoch 7 - iter 648/723 - loss 0.01861037 - time (sec): 54.02 - samples/sec: 2926.91 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:15:27,292 epoch 7 - iter 720/723 - loss 0.01854355 - time (sec): 60.34 - samples/sec: 2913.10 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:15:27,458 ----------------------------------------------------------------------------------------------------
2023-10-14 10:15:27,458 EPOCH 7 done: loss 0.0186 - lr: 0.000010
2023-10-14 10:15:30,986 DEV : loss 0.15326355397701263 - f1-score (micro avg) 0.8083
2023-10-14 10:15:31,002 ----------------------------------------------------------------------------------------------------
2023-10-14 10:15:37,084 epoch 8 - iter 72/723 - loss 0.00922642 - time (sec): 6.08 - samples/sec: 2974.53 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:15:43,173 epoch 8 - iter 144/723 - loss 0.00982421 - time (sec): 12.17 - samples/sec: 2890.61 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:15:50,181 epoch 8 - iter 216/723 - loss 0.01240116 - time (sec): 19.18 - samples/sec: 2858.17 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:15:55,387 epoch 8 - iter 288/723 - loss 0.01275710 - time (sec): 24.38 - samples/sec: 2850.31 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:16:01,729 epoch 8 - iter 360/723 - loss 0.01362184 - time (sec): 30.73 - samples/sec: 2872.18 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:16:07,858 epoch 8 - iter 432/723 - loss 0.01349404 - time (sec): 36.86 - samples/sec: 2895.21 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:16:13,356 epoch 8 - iter 504/723 - loss 0.01283957 - time (sec): 42.35 - samples/sec: 2923.66 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:16:19,110 epoch 8 - iter 576/723 - loss 0.01301832 - time (sec): 48.11 - samples/sec: 2930.74 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:16:25,266 epoch 8 - iter 648/723 - loss 0.01330593 - time (sec): 54.26 - samples/sec: 2929.25 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:16:31,073 epoch 8 - iter 720/723 - loss 0.01398724 - time (sec): 60.07 - samples/sec: 2925.95 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:16:31,266 ----------------------------------------------------------------------------------------------------
2023-10-14 10:16:31,266 EPOCH 8 done: loss 0.0140 - lr: 0.000007
2023-10-14 10:16:34,869 DEV : loss 0.1955159604549408 - f1-score (micro avg) 0.8047
2023-10-14 10:16:34,892 ----------------------------------------------------------------------------------------------------
2023-10-14 10:16:40,994 epoch 9 - iter 72/723 - loss 0.00962978 - time (sec): 6.10 - samples/sec: 2955.27 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:16:47,071 epoch 9 - iter 144/723 - loss 0.00819440 - time (sec): 12.18 - samples/sec: 2924.62 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:16:52,865 epoch 9 - iter 216/723 - loss 0.00783127 - time (sec): 17.97 - samples/sec: 2925.93 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:16:59,334 epoch 9 - iter 288/723 - loss 0.00813223 - time (sec): 24.44 - samples/sec: 2904.89 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:17:05,172 epoch 9 - iter 360/723 - loss 0.00940680 - time (sec): 30.28 - samples/sec: 2902.81 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:17:11,766 epoch 9 - iter 432/723 - loss 0.00956376 - time (sec): 36.87 - samples/sec: 2903.92 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:17:17,426 epoch 9 - iter 504/723 - loss 0.00940288 - time (sec): 42.53 - samples/sec: 2903.57 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:17:23,485 epoch 9 - iter 576/723 - loss 0.01068546 - time (sec): 48.59 - samples/sec: 2914.15 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:17:29,177 epoch 9 - iter 648/723 - loss 0.01130811 - time (sec): 54.28 - samples/sec: 2921.02 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:17:35,240 epoch 9 - iter 720/723 - loss 0.01166067 - time (sec): 60.35 - samples/sec: 2914.12 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:17:35,404 ----------------------------------------------------------------------------------------------------
2023-10-14 10:17:35,404 EPOCH 9 done: loss 0.0119 - lr: 0.000003
2023-10-14 10:17:39,827 DEV : loss 0.16821207106113434 - f1-score (micro avg) 0.8241
2023-10-14 10:17:39,847 saving best model
2023-10-14 10:17:40,360 ----------------------------------------------------------------------------------------------------
2023-10-14 10:17:46,342 epoch 10 - iter 72/723 - loss 0.00286898 - time (sec): 5.97 - samples/sec: 2787.87 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:17:52,930 epoch 10 - iter 144/723 - loss 0.00425928 - time (sec): 12.56 - samples/sec: 2824.56 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:17:58,963 epoch 10 - iter 216/723 - loss 0.00904585 - time (sec): 18.60 - samples/sec: 2861.44 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:18:04,706 epoch 10 - iter 288/723 - loss 0.00863157 - time (sec): 24.34 - samples/sec: 2868.59 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:18:11,407 epoch 10 - iter 360/723 - loss 0.00931463 - time (sec): 31.04 - samples/sec: 2843.47 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:18:17,207 epoch 10 - iter 432/723 - loss 0.00842242 - time (sec): 36.84 - samples/sec: 2862.15 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:18:23,428 epoch 10 - iter 504/723 - loss 0.00864621 - time (sec): 43.06 - samples/sec: 2879.88 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:18:29,354 epoch 10 - iter 576/723 - loss 0.00805536 - time (sec): 48.99 - samples/sec: 2885.00 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:18:34,991 epoch 10 - iter 648/723 - loss 0.00772375 - time (sec): 54.62 - samples/sec: 2894.74 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:18:41,108 epoch 10 - iter 720/723 - loss 0.00777814 - time (sec): 60.74 - samples/sec: 2888.92 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:18:41,439 ----------------------------------------------------------------------------------------------------
2023-10-14 10:18:41,439 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-14 10:18:45,046 DEV : loss 0.17490935325622559 - f1-score (micro avg) 0.8154
2023-10-14 10:18:45,487 ----------------------------------------------------------------------------------------------------
2023-10-14 10:18:45,489 Loading model from best epoch ...
2023-10-14 10:18:47,257 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
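A side note on the tag dictionary above: the 13 tags are the BIOES labeling scheme over the three entity types plus the outside tag, which is exactly why the final linear layer has out_features=13:

```python
# Reconstructing the 13-tag BIOES dictionary logged above.
types = ["LOC", "PER", "ORG"]
tags = ["O"] + [f"{p}-{t}" for t in types for p in ("S", "B", "E", "I")]
print(len(tags))  # 13 — matches out_features=13 of the final linear layer
print(tags)
```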
2023-10-14 10:18:50,812
Results:
- F-score (micro) 0.8239
- F-score (macro) 0.7232
- Accuracy 0.712

By class:
              precision    recall  f1-score   support

         PER     0.8532    0.8444    0.8488       482
         LOC     0.8584    0.8472    0.8527       458
         ORG     0.4583    0.4783    0.4681        69

   micro avg     0.8272    0.8206    0.8239      1009
   macro avg     0.7233    0.7233    0.7232      1009
weighted avg     0.8286    0.8206    0.8246      1009
2023-10-14 10:18:50,812 ----------------------------------------------------------------------------------------------------
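The summary rows of the evaluation table can be recomputed from the per-class numbers as a consistency check. Values below are copied from the table; the weighted average drifts in its last digit because the per-class F1 scores are themselves rounded:

```python
# Recomputing the micro/macro/weighted F1 rows from the per-class table.
per_class = {              # class: (precision, recall, f1, support)
    "PER": (0.8532, 0.8444, 0.8488, 482),
    "LOC": (0.8584, 0.8472, 0.8527, 458),
    "ORG": (0.4583, 0.4783, 0.4681, 69),
}

# Micro F1: harmonic mean of the micro-averaged precision and recall.
p_micro, r_micro = 0.8272, 0.8206
f1_micro = 2 * p_micro * r_micro / (p_micro + r_micro)

# Macro F1: unweighted mean of the per-class F1 scores.
f1_macro = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: support-weighted mean of the per-class F1 scores.
total = sum(n for *_, n in per_class.values())
f1_weighted = sum(f1 * n for _, _, f1, n in per_class.values()) / total

print(round(f1_micro, 4))   # 0.8239 — matches "F-score (micro)"
print(round(f1_macro, 4))   # 0.7232 — matches "F-score (macro)"
print(f1_weighted)          # ~0.8246, the "weighted avg" row
```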