2023-10-14 09:04:52,387 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,388 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 09:04:52,388 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,388 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 09:04:52,388 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,388 Train: 5777 sentences
2023-10-14 09:04:52,388 (train_with_dev=False, train_with_test=False)
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,389 Training Params:
2023-10-14 09:04:52,389 - learning_rate: "5e-05"
2023-10-14 09:04:52,389 - mini_batch_size: "4"
2023-10-14 09:04:52,389 - max_epochs: "10"
2023-10-14 09:04:52,389 - shuffle: "True"
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,389 Plugins:
2023-10-14 09:04:52,389 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,389 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 09:04:52,389 - metric: "('micro avg', 'f1-score')"
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,389 Computation:
2023-10-14 09:04:52,389 - compute on device: cuda:0
2023-10-14 09:04:52,389 - embedding storage: none
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,389 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:52,389 ----------------------------------------------------------------------------------------------------
2023-10-14 09:04:59,731 epoch 1 - iter 144/1445 - loss 1.50405988 - time (sec): 7.34 - samples/sec: 2529.66 - lr: 0.000005 - momentum: 0.000000
2023-10-14 09:05:06,888 epoch 1 - iter 288/1445 - loss 0.91392848 - time (sec): 14.50 - samples/sec: 2426.99 - lr: 0.000010 - momentum: 0.000000
2023-10-14 09:05:13,984 epoch 1 - iter 432/1445 - loss 0.69288586 - time (sec): 21.59 - samples/sec: 2402.59 - lr: 0.000015 - momentum: 0.000000
2023-10-14 09:05:21,264 epoch 1 - iter 576/1445 - loss 0.56388054 - time (sec): 28.87 - samples/sec: 2399.95 - lr: 0.000020 - momentum: 0.000000
2023-10-14 09:05:28,714 epoch 1 - iter 720/1445 - loss 0.48076646 - time (sec): 36.32 - samples/sec: 2419.88 - lr: 0.000025 - momentum: 0.000000
2023-10-14 09:05:35,678 epoch 1 - iter 864/1445 - loss 0.42847356 - time (sec): 43.29 - samples/sec: 2427.69 - lr: 0.000030 - momentum: 0.000000
2023-10-14 09:05:43,142 epoch 1 - iter 1008/1445 - loss 0.39150451 - time (sec): 50.75 - samples/sec: 2417.58 - lr: 0.000035 - momentum: 0.000000
2023-10-14 09:05:50,471 epoch 1 - iter 1152/1445 - loss 0.36318531 - time (sec): 58.08 - samples/sec: 2419.76 - lr: 0.000040 - momentum: 0.000000
2023-10-14 09:05:57,811 epoch 1 - iter 1296/1445 - loss 0.33728572 - time (sec): 65.42 - samples/sec: 2423.80 - lr: 0.000045 - momentum: 0.000000
2023-10-14 09:06:04,943 epoch 1 - iter 1440/1445 - loss 0.31810008 - time (sec): 72.55 - samples/sec: 2424.07 - lr: 0.000050 - momentum: 0.000000
2023-10-14 09:06:05,168 ----------------------------------------------------------------------------------------------------
2023-10-14 09:06:05,168 EPOCH 1 done: loss 0.3178 - lr: 0.000050
2023-10-14 09:06:08,223 DEV : loss 0.22383828461170197 - f1-score (micro avg) 0.2171
2023-10-14 09:06:08,249 saving best model
2023-10-14 09:06:08,623 ----------------------------------------------------------------------------------------------------
2023-10-14 09:06:16,873 epoch 2 - iter 144/1445 - loss 0.12739868 - time (sec): 8.25 - samples/sec: 2132.47 - lr: 0.000049 - momentum: 0.000000
2023-10-14 09:06:24,728 epoch 2 - iter 288/1445 - loss 0.12588351 - time (sec): 16.10 - samples/sec: 2237.63 - lr: 0.000049 - momentum: 0.000000
2023-10-14 09:06:32,016 epoch 2 - iter 432/1445 - loss 0.12372856 - time (sec): 23.39 - samples/sec: 2310.84 - lr: 0.000048 - momentum: 0.000000
2023-10-14 09:06:39,143 epoch 2 - iter 576/1445 - loss 0.12794083 - time (sec): 30.52 - samples/sec: 2340.94 - lr: 0.000048 - momentum: 0.000000
2023-10-14 09:06:46,291 epoch 2 - iter 720/1445 - loss 0.12533257 - time (sec): 37.67 - samples/sec: 2340.89 - lr: 0.000047 - momentum: 0.000000
2023-10-14 09:06:54,019 epoch 2 - iter 864/1445 - loss 0.12315495 - time (sec): 45.39 - samples/sec: 2335.78 - lr: 0.000047 - momentum: 0.000000
2023-10-14 09:07:01,101 epoch 2 - iter 1008/1445 - loss 0.11938446 - time (sec): 52.48 - samples/sec: 2351.84 - lr: 0.000046 - momentum: 0.000000
2023-10-14 09:07:08,628 epoch 2 - iter 1152/1445 - loss 0.11944013 - time (sec): 60.00 - samples/sec: 2359.33 - lr: 0.000046 - momentum: 0.000000
2023-10-14 09:07:15,760 epoch 2 - iter 1296/1445 - loss 0.11754636 - time (sec): 67.13 - samples/sec: 2367.10 - lr: 0.000045 - momentum: 0.000000
2023-10-14 09:07:22,852 epoch 2 - iter 1440/1445 - loss 0.11532296 - time (sec): 74.23 - samples/sec: 2368.20 - lr: 0.000044 - momentum: 0.000000
2023-10-14 09:07:23,072 ----------------------------------------------------------------------------------------------------
2023-10-14 09:07:23,072 EPOCH 2 done: loss 0.1153 - lr: 0.000044
2023-10-14 09:07:26,666 DEV : loss 0.10029048472642899 - f1-score (micro avg) 0.7736
2023-10-14 09:07:26,692 saving best model
2023-10-14 09:07:27,517 ----------------------------------------------------------------------------------------------------
2023-10-14 09:07:35,649 epoch 3 - iter 144/1445 - loss 0.06660446 - time (sec): 8.13 - samples/sec: 2108.84 - lr: 0.000044 - momentum: 0.000000
2023-10-14 09:07:43,714 epoch 3 - iter 288/1445 - loss 0.07183831 - time (sec): 16.19 - samples/sec: 2118.00 - lr: 0.000043 - momentum: 0.000000
2023-10-14 09:07:51,054 epoch 3 - iter 432/1445 - loss 0.07634924 - time (sec): 23.53 - samples/sec: 2189.46 - lr: 0.000043 - momentum: 0.000000
2023-10-14 09:07:59,122 epoch 3 - iter 576/1445 - loss 0.08024319 - time (sec): 31.60 - samples/sec: 2280.87 - lr: 0.000042 - momentum: 0.000000
2023-10-14 09:08:06,177 epoch 3 - iter 720/1445 - loss 0.07906010 - time (sec): 38.66 - samples/sec: 2305.52 - lr: 0.000042 - momentum: 0.000000
2023-10-14 09:08:13,421 epoch 3 - iter 864/1445 - loss 0.07938145 - time (sec): 45.90 - samples/sec: 2315.39 - lr: 0.000041 - momentum: 0.000000
2023-10-14 09:08:20,731 epoch 3 - iter 1008/1445 - loss 0.07745804 - time (sec): 53.21 - samples/sec: 2331.78 - lr: 0.000041 - momentum: 0.000000
2023-10-14 09:08:27,729 epoch 3 - iter 1152/1445 - loss 0.07574091 - time (sec): 60.21 - samples/sec: 2335.63 - lr: 0.000040 - momentum: 0.000000
2023-10-14 09:08:34,856 epoch 3 - iter 1296/1445 - loss 0.07435426 - time (sec): 67.34 - samples/sec: 2353.73 - lr: 0.000039 - momentum: 0.000000
2023-10-14 09:08:41,927 epoch 3 - iter 1440/1445 - loss 0.07446722 - time (sec): 74.41 - samples/sec: 2360.70 - lr: 0.000039 - momentum: 0.000000
2023-10-14 09:08:42,150 ----------------------------------------------------------------------------------------------------
2023-10-14 09:08:42,150 EPOCH 3 done: loss 0.0744 - lr: 0.000039
2023-10-14 09:08:46,047 DEV : loss 0.11615423113107681 - f1-score (micro avg) 0.778
2023-10-14 09:08:46,064 saving best model
2023-10-14 09:08:46,594 ----------------------------------------------------------------------------------------------------
2023-10-14 09:08:54,061 epoch 4 - iter 144/1445 - loss 0.05594958 - time (sec): 7.46 - samples/sec: 2420.83 - lr: 0.000038 - momentum: 0.000000
2023-10-14 09:09:01,045 epoch 4 - iter 288/1445 - loss 0.05956758 - time (sec): 14.45 - samples/sec: 2391.16 - lr: 0.000038 - momentum: 0.000000
2023-10-14 09:09:08,332 epoch 4 - iter 432/1445 - loss 0.05892843 - time (sec): 21.74 - samples/sec: 2399.08 - lr: 0.000037 - momentum: 0.000000
2023-10-14 09:09:15,764 epoch 4 - iter 576/1445 - loss 0.05643042 - time (sec): 29.17 - samples/sec: 2427.62 - lr: 0.000037 - momentum: 0.000000
2023-10-14 09:09:22,692 epoch 4 - iter 720/1445 - loss 0.05466634 - time (sec): 36.10 - samples/sec: 2395.41 - lr: 0.000036 - momentum: 0.000000
2023-10-14 09:09:30,261 epoch 4 - iter 864/1445 - loss 0.05376519 - time (sec): 43.66 - samples/sec: 2414.94 - lr: 0.000036 - momentum: 0.000000
2023-10-14 09:09:37,689 epoch 4 - iter 1008/1445 - loss 0.05239450 - time (sec): 51.09 - samples/sec: 2416.84 - lr: 0.000035 - momentum: 0.000000
2023-10-14 09:09:45,019 epoch 4 - iter 1152/1445 - loss 0.05246057 - time (sec): 58.42 - samples/sec: 2401.90 - lr: 0.000034 - momentum: 0.000000
2023-10-14 09:09:52,017 epoch 4 - iter 1296/1445 - loss 0.05428593 - time (sec): 65.42 - samples/sec: 2392.09 - lr: 0.000034 - momentum: 0.000000
2023-10-14 09:09:59,346 epoch 4 - iter 1440/1445 - loss 0.05316916 - time (sec): 72.75 - samples/sec: 2412.94 - lr: 0.000033 - momentum: 0.000000
2023-10-14 09:09:59,624 ----------------------------------------------------------------------------------------------------
2023-10-14 09:09:59,624 EPOCH 4 done: loss 0.0530 - lr: 0.000033
2023-10-14 09:10:03,275 DEV : loss 0.16486844420433044 - f1-score (micro avg) 0.7972
2023-10-14 09:10:03,297 saving best model
2023-10-14 09:10:03,819 ----------------------------------------------------------------------------------------------------
2023-10-14 09:10:11,682 epoch 5 - iter 144/1445 - loss 0.04301876 - time (sec): 7.86 - samples/sec: 2266.60 - lr: 0.000033 - momentum: 0.000000
2023-10-14 09:10:19,172 epoch 5 - iter 288/1445 - loss 0.04002525 - time (sec): 15.35 - samples/sec: 2415.66 - lr: 0.000032 - momentum: 0.000000
2023-10-14 09:10:27,111 epoch 5 - iter 432/1445 - loss 0.04143923 - time (sec): 23.29 - samples/sec: 2311.89 - lr: 0.000032 - momentum: 0.000000
2023-10-14 09:10:34,937 epoch 5 - iter 576/1445 - loss 0.04156500 - time (sec): 31.11 - samples/sec: 2336.56 - lr: 0.000031 - momentum: 0.000000
2023-10-14 09:10:42,201 epoch 5 - iter 720/1445 - loss 0.04080371 - time (sec): 38.38 - samples/sec: 2331.02 - lr: 0.000031 - momentum: 0.000000
2023-10-14 09:10:49,603 epoch 5 - iter 864/1445 - loss 0.04181842 - time (sec): 45.78 - samples/sec: 2343.37 - lr: 0.000030 - momentum: 0.000000
2023-10-14 09:10:56,781 epoch 5 - iter 1008/1445 - loss 0.04107430 - time (sec): 52.96 - samples/sec: 2351.04 - lr: 0.000029 - momentum: 0.000000
2023-10-14 09:11:04,457 epoch 5 - iter 1152/1445 - loss 0.04027196 - time (sec): 60.63 - samples/sec: 2350.35 - lr: 0.000029 - momentum: 0.000000
2023-10-14 09:11:11,722 epoch 5 - iter 1296/1445 - loss 0.04042528 - time (sec): 67.90 - samples/sec: 2343.40 - lr: 0.000028 - momentum: 0.000000
2023-10-14 09:11:18,740 epoch 5 - iter 1440/1445 - loss 0.03984886 - time (sec): 74.92 - samples/sec: 2345.19 - lr: 0.000028 - momentum: 0.000000
2023-10-14 09:11:18,993 ----------------------------------------------------------------------------------------------------
2023-10-14 09:11:18,993 EPOCH 5 done: loss 0.0399 - lr: 0.000028
2023-10-14 09:11:22,559 DEV : loss 0.15874287486076355 - f1-score (micro avg) 0.7991
2023-10-14 09:11:22,576 saving best model
2023-10-14 09:11:23,086 ----------------------------------------------------------------------------------------------------
2023-10-14 09:11:30,207 epoch 6 - iter 144/1445 - loss 0.02521130 - time (sec): 7.12 - samples/sec: 2399.62 - lr: 0.000027 - momentum: 0.000000
2023-10-14 09:11:37,191 epoch 6 - iter 288/1445 - loss 0.02915345 - time (sec): 14.10 - samples/sec: 2371.57 - lr: 0.000027 - momentum: 0.000000
2023-10-14 09:11:44,390 epoch 6 - iter 432/1445 - loss 0.02889368 - time (sec): 21.30 - samples/sec: 2397.50 - lr: 0.000026 - momentum: 0.000000
2023-10-14 09:11:51,631 epoch 6 - iter 576/1445 - loss 0.02703670 - time (sec): 28.54 - samples/sec: 2404.49 - lr: 0.000026 - momentum: 0.000000
2023-10-14 09:11:59,154 epoch 6 - iter 720/1445 - loss 0.02550238 - time (sec): 36.07 - samples/sec: 2413.63 - lr: 0.000025 - momentum: 0.000000
2023-10-14 09:12:06,648 epoch 6 - iter 864/1445 - loss 0.02674173 - time (sec): 43.56 - samples/sec: 2403.51 - lr: 0.000024 - momentum: 0.000000
2023-10-14 09:12:13,801 epoch 6 - iter 1008/1445 - loss 0.02707634 - time (sec): 50.71 - samples/sec: 2404.91 - lr: 0.000024 - momentum: 0.000000
2023-10-14 09:12:21,106 epoch 6 - iter 1152/1445 - loss 0.02669468 - time (sec): 58.02 - samples/sec: 2414.59 - lr: 0.000023 - momentum: 0.000000
2023-10-14 09:12:28,455 epoch 6 - iter 1296/1445 - loss 0.02836933 - time (sec): 65.37 - samples/sec: 2419.19 - lr: 0.000023 - momentum: 0.000000
2023-10-14 09:12:35,768 epoch 6 - iter 1440/1445 - loss 0.02995850 - time (sec): 72.68 - samples/sec: 2418.03 - lr: 0.000022 - momentum: 0.000000
2023-10-14 09:12:35,987 ----------------------------------------------------------------------------------------------------
2023-10-14 09:12:35,987 EPOCH 6 done: loss 0.0299 - lr: 0.000022
2023-10-14 09:12:39,997 DEV : loss 0.21732811629772186 - f1-score (micro avg) 0.7943
2023-10-14 09:12:40,014 ----------------------------------------------------------------------------------------------------
2023-10-14 09:12:47,625 epoch 7 - iter 144/1445 - loss 0.01983330 - time (sec): 7.61 - samples/sec: 2465.73 - lr: 0.000022 - momentum: 0.000000
2023-10-14 09:12:55,063 epoch 7 - iter 288/1445 - loss 0.02159468 - time (sec): 15.05 - samples/sec: 2432.94 - lr: 0.000021 - momentum: 0.000000
2023-10-14 09:13:02,204 epoch 7 - iter 432/1445 - loss 0.02251919 - time (sec): 22.19 - samples/sec: 2420.12 - lr: 0.000021 - momentum: 0.000000
2023-10-14 09:13:09,408 epoch 7 - iter 576/1445 - loss 0.02342106 - time (sec): 29.39 - samples/sec: 2410.31 - lr: 0.000020 - momentum: 0.000000
2023-10-14 09:13:16,649 epoch 7 - iter 720/1445 - loss 0.02136917 - time (sec): 36.63 - samples/sec: 2420.11 - lr: 0.000019 - momentum: 0.000000
2023-10-14 09:13:24,040 epoch 7 - iter 864/1445 - loss 0.02109943 - time (sec): 44.03 - samples/sec: 2428.59 - lr: 0.000019 - momentum: 0.000000
2023-10-14 09:13:31,063 epoch 7 - iter 1008/1445 - loss 0.02065832 - time (sec): 51.05 - samples/sec: 2417.46 - lr: 0.000018 - momentum: 0.000000
2023-10-14 09:13:38,030 epoch 7 - iter 1152/1445 - loss 0.02010615 - time (sec): 58.01 - samples/sec: 2409.33 - lr: 0.000018 - momentum: 0.000000
2023-10-14 09:13:45,589 epoch 7 - iter 1296/1445 - loss 0.02234686 - time (sec): 65.57 - samples/sec: 2401.30 - lr: 0.000017 - momentum: 0.000000
2023-10-14 09:13:53,195 epoch 7 - iter 1440/1445 - loss 0.02249980 - time (sec): 73.18 - samples/sec: 2401.41 - lr: 0.000017 - momentum: 0.000000
2023-10-14 09:13:53,429 ----------------------------------------------------------------------------------------------------
2023-10-14 09:13:53,429 EPOCH 7 done: loss 0.0224 - lr: 0.000017
2023-10-14 09:13:57,001 DEV : loss 0.17771700024604797 - f1-score (micro avg) 0.8009
2023-10-14 09:13:57,019 saving best model
2023-10-14 09:13:57,654 ----------------------------------------------------------------------------------------------------
2023-10-14 09:14:04,754 epoch 8 - iter 144/1445 - loss 0.01590043 - time (sec): 7.10 - samples/sec: 2420.47 - lr: 0.000016 - momentum: 0.000000
2023-10-14 09:14:12,074 epoch 8 - iter 288/1445 - loss 0.01595828 - time (sec): 14.42 - samples/sec: 2413.02 - lr: 0.000016 - momentum: 0.000000
2023-10-14 09:14:19,826 epoch 8 - iter 432/1445 - loss 0.01610757 - time (sec): 22.17 - samples/sec: 2370.89 - lr: 0.000015 - momentum: 0.000000
2023-10-14 09:14:27,167 epoch 8 - iter 576/1445 - loss 0.01647436 - time (sec): 29.51 - samples/sec: 2362.40 - lr: 0.000014 - momentum: 0.000000
2023-10-14 09:14:34,347 epoch 8 - iter 720/1445 - loss 0.01689704 - time (sec): 36.69 - samples/sec: 2370.89 - lr: 0.000014 - momentum: 0.000000
2023-10-14 09:14:41,943 epoch 8 - iter 864/1445 - loss 0.01595512 - time (sec): 44.29 - samples/sec: 2366.99 - lr: 0.000013 - momentum: 0.000000
2023-10-14 09:14:49,142 epoch 8 - iter 1008/1445 - loss 0.01521523 - time (sec): 51.49 - samples/sec: 2373.12 - lr: 0.000013 - momentum: 0.000000
2023-10-14 09:14:56,752 epoch 8 - iter 1152/1445 - loss 0.01566291 - time (sec): 59.10 - samples/sec: 2374.23 - lr: 0.000012 - momentum: 0.000000
2023-10-14 09:15:04,136 epoch 8 - iter 1296/1445 - loss 0.01524212 - time (sec): 66.48 - samples/sec: 2380.20 - lr: 0.000012 - momentum: 0.000000
2023-10-14 09:15:11,600 epoch 8 - iter 1440/1445 - loss 0.01548357 - time (sec): 73.94 - samples/sec: 2374.29 - lr: 0.000011 - momentum: 0.000000
2023-10-14 09:15:11,887 ----------------------------------------------------------------------------------------------------
2023-10-14 09:15:11,887 EPOCH 8 done: loss 0.0157 - lr: 0.000011
2023-10-14 09:15:15,393 DEV : loss 0.18877749145030975 - f1-score (micro avg) 0.8117
2023-10-14 09:15:15,410 saving best model
2023-10-14 09:15:15,872 ----------------------------------------------------------------------------------------------------
2023-10-14 09:15:24,218 epoch 9 - iter 144/1445 - loss 0.01359555 - time (sec): 8.34 - samples/sec: 2153.76 - lr: 0.000011 - momentum: 0.000000
2023-10-14 09:15:32,316 epoch 9 - iter 288/1445 - loss 0.01054503 - time (sec): 16.44 - samples/sec: 2214.35 - lr: 0.000010 - momentum: 0.000000
2023-10-14 09:15:40,564 epoch 9 - iter 432/1445 - loss 0.01149262 - time (sec): 24.69 - samples/sec: 2159.06 - lr: 0.000009 - momentum: 0.000000
2023-10-14 09:15:48,781 epoch 9 - iter 576/1445 - loss 0.00991121 - time (sec): 32.91 - samples/sec: 2155.60 - lr: 0.000009 - momentum: 0.000000
2023-10-14 09:15:57,145 epoch 9 - iter 720/1445 - loss 0.01022072 - time (sec): 41.27 - samples/sec: 2165.38 - lr: 0.000008 - momentum: 0.000000
2023-10-14 09:16:05,349 epoch 9 - iter 864/1445 - loss 0.01065988 - time (sec): 49.47 - samples/sec: 2175.46 - lr: 0.000008 - momentum: 0.000000
2023-10-14 09:16:13,454 epoch 9 - iter 1008/1445 - loss 0.01047689 - time (sec): 57.58 - samples/sec: 2176.55 - lr: 0.000007 - momentum: 0.000000
2023-10-14 09:16:21,315 epoch 9 - iter 1152/1445 - loss 0.01036723 - time (sec): 65.44 - samples/sec: 2165.04 - lr: 0.000007 - momentum: 0.000000
2023-10-14 09:16:29,408 epoch 9 - iter 1296/1445 - loss 0.01166652 - time (sec): 73.53 - samples/sec: 2169.75 - lr: 0.000006 - momentum: 0.000000
2023-10-14 09:16:37,351 epoch 9 - iter 1440/1445 - loss 0.01098843 - time (sec): 81.48 - samples/sec: 2156.71 - lr: 0.000006 - momentum: 0.000000
2023-10-14 09:16:37,608 ----------------------------------------------------------------------------------------------------
2023-10-14 09:16:37,608 EPOCH 9 done: loss 0.0110 - lr: 0.000006
2023-10-14 09:16:41,539 DEV : loss 0.20204880833625793 - f1-score (micro avg) 0.8105
2023-10-14 09:16:41,555 ----------------------------------------------------------------------------------------------------
2023-10-14 09:16:48,967 epoch 10 - iter 144/1445 - loss 0.00491564 - time (sec): 7.41 - samples/sec: 2411.29 - lr: 0.000005 - momentum: 0.000000
2023-10-14 09:16:56,199 epoch 10 - iter 288/1445 - loss 0.00600497 - time (sec): 14.64 - samples/sec: 2387.86 - lr: 0.000004 - momentum: 0.000000
2023-10-14 09:17:03,075 epoch 10 - iter 432/1445 - loss 0.00613285 - time (sec): 21.52 - samples/sec: 2398.58 - lr: 0.000004 - momentum: 0.000000
2023-10-14 09:17:10,266 epoch 10 - iter 576/1445 - loss 0.00629679 - time (sec): 28.71 - samples/sec: 2427.46 - lr: 0.000003 - momentum: 0.000000
2023-10-14 09:17:17,507 epoch 10 - iter 720/1445 - loss 0.00605855 - time (sec): 35.95 - samples/sec: 2413.82 - lr: 0.000003 - momentum: 0.000000
2023-10-14 09:17:24,864 epoch 10 - iter 864/1445 - loss 0.00744087 - time (sec): 43.31 - samples/sec: 2421.84 - lr: 0.000002 - momentum: 0.000000
2023-10-14 09:17:32,104 epoch 10 - iter 1008/1445 - loss 0.00687721 - time (sec): 50.55 - samples/sec: 2429.28 - lr: 0.000002 - momentum: 0.000000
2023-10-14 09:17:39,595 epoch 10 - iter 1152/1445 - loss 0.00728344 - time (sec): 58.04 - samples/sec: 2427.34 - lr: 0.000001 - momentum: 0.000000
2023-10-14 09:17:46,851 epoch 10 - iter 1296/1445 - loss 0.00674719 - time (sec): 65.29 - samples/sec: 2431.63 - lr: 0.000001 - momentum: 0.000000
2023-10-14 09:17:53,975 epoch 10 - iter 1440/1445 - loss 0.00726142 - time (sec): 72.42 - samples/sec: 2425.99 - lr: 0.000000 - momentum: 0.000000
2023-10-14 09:17:54,218 ----------------------------------------------------------------------------------------------------
2023-10-14 09:17:54,218 EPOCH 10 done: loss 0.0072 - lr: 0.000000
2023-10-14 09:17:57,706 DEV : loss 0.20339810848236084 - f1-score (micro avg) 0.822
2023-10-14 09:17:57,722 saving best model
2023-10-14 09:17:58,576 ----------------------------------------------------------------------------------------------------
2023-10-14 09:17:58,577 Loading model from best epoch ...
2023-10-14 09:18:00,174 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 09:18:03,299
Results:
- F-score (micro) 0.7921
- F-score (macro) 0.6958
- Accuracy 0.6684
By class:
              precision    recall  f1-score   support
         PER     0.8134    0.7780    0.7953       482
         LOC     0.8864    0.7838    0.8320       458
         ORG     0.5909    0.3768    0.4602        69
   micro avg     0.8352    0.7532    0.7921      1009
   macro avg     0.7636    0.6462    0.6958      1009
weighted avg     0.8314    0.7532    0.7890      1009
2023-10-14 09:18:03,300 ----------------------------------------------------------------------------------------------------