2023-10-14 11:33:32,804 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 11:33:32,805 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 11:33:32,805 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 Train: 5777 sentences
2023-10-14 11:33:32,805 (train_with_dev=False, train_with_test=False)
2023-10-14 11:33:32,805 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 Training Params:
2023-10-14 11:33:32,805 - learning_rate: "5e-05"
2023-10-14 11:33:32,806 - mini_batch_size: "4"
2023-10-14 11:33:32,806 - max_epochs: "10"
2023-10-14 11:33:32,806 - shuffle: "True"
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Plugins:
2023-10-14 11:33:32,806 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 11:33:32,806 - metric: "('micro avg', 'f1-score')"
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Computation:
2023-10-14 11:33:32,806 - compute on device: cuda:0
2023-10-14 11:33:32,806 - embedding storage: none
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
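Note: the training script itself is not part of this log. A minimal Flair sketch that reproduces the configuration above might look roughly as follows (assuming a recent Flair release; the dataset loader name, the dbmdz/bert-base-historic-multilingual-cased checkpoint, and the SequenceTagger keyword arguments are inferred from the log, not copied from the actual script):

from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dutch split of the ICDAR-Europeana NER corpus (cached under ~/.flair/datasets)
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

# hmBERT embeddings: last layer only, first-subtoken pooling, fine-tuned end to end
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# plain linear tag head, no CRF and no RNN, matching the printed architecture
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear scheduler with 10% warmup by default,
# which corresponds to the LinearScheduler plugin listed above
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)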
2023-10-14 11:33:41,067 epoch 1 - iter 144/1445 - loss 1.43369159 - time (sec): 8.26 - samples/sec: 2245.44 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:33:48,582 epoch 1 - iter 288/1445 - loss 0.85898979 - time (sec): 15.78 - samples/sec: 2276.91 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:33:55,828 epoch 1 - iter 432/1445 - loss 0.64395719 - time (sec): 23.02 - samples/sec: 2316.75 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:34:03,175 epoch 1 - iter 576/1445 - loss 0.53253749 - time (sec): 30.37 - samples/sec: 2350.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:34:10,523 epoch 1 - iter 720/1445 - loss 0.46097312 - time (sec): 37.72 - samples/sec: 2351.32 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:34:17,871 epoch 1 - iter 864/1445 - loss 0.41136182 - time (sec): 45.06 - samples/sec: 2357.69 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:34:25,309 epoch 1 - iter 1008/1445 - loss 0.37502039 - time (sec): 52.50 - samples/sec: 2361.41 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:34:32,412 epoch 1 - iter 1152/1445 - loss 0.34642014 - time (sec): 59.61 - samples/sec: 2352.91 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:34:39,655 epoch 1 - iter 1296/1445 - loss 0.32134796 - time (sec): 66.85 - samples/sec: 2361.73 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:34:46,952 epoch 1 - iter 1440/1445 - loss 0.30279384 - time (sec): 74.14 - samples/sec: 2368.90 - lr: 0.000050 - momentum: 0.000000
2023-10-14 11:34:47,192 ----------------------------------------------------------------------------------------------------
2023-10-14 11:34:47,193 EPOCH 1 done: loss 0.3024 - lr: 0.000050
2023-10-14 11:34:50,329 DEV : loss 0.14426745474338531 - f1-score (micro avg) 0.631
2023-10-14 11:34:50,345 saving best model
2023-10-14 11:34:50,783 ----------------------------------------------------------------------------------------------------
2023-10-14 11:34:58,203 epoch 2 - iter 144/1445 - loss 0.11871103 - time (sec): 7.42 - samples/sec: 2306.42 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:35:05,309 epoch 2 - iter 288/1445 - loss 0.12164115 - time (sec): 14.52 - samples/sec: 2350.70 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:35:12,783 epoch 2 - iter 432/1445 - loss 0.11897072 - time (sec): 22.00 - samples/sec: 2382.87 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:35:20,278 epoch 2 - iter 576/1445 - loss 0.11592658 - time (sec): 29.49 - samples/sec: 2404.28 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:35:27,275 epoch 2 - iter 720/1445 - loss 0.11482763 - time (sec): 36.49 - samples/sec: 2410.85 - lr: 0.000047 - momentum: 0.000000
2023-10-14 11:35:34,371 epoch 2 - iter 864/1445 - loss 0.11209204 - time (sec): 43.59 - samples/sec: 2402.63 - lr: 0.000047 - momentum: 0.000000
2023-10-14 11:35:41,763 epoch 2 - iter 1008/1445 - loss 0.11071289 - time (sec): 50.98 - samples/sec: 2399.75 - lr: 0.000046 - momentum: 0.000000
2023-10-14 11:35:49,360 epoch 2 - iter 1152/1445 - loss 0.10882393 - time (sec): 58.58 - samples/sec: 2393.42 - lr: 0.000046 - momentum: 0.000000
2023-10-14 11:35:56,980 epoch 2 - iter 1296/1445 - loss 0.10920607 - time (sec): 66.20 - samples/sec: 2389.87 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:36:04,656 epoch 2 - iter 1440/1445 - loss 0.10848901 - time (sec): 73.87 - samples/sec: 2379.55 - lr: 0.000044 - momentum: 0.000000
2023-10-14 11:36:04,876 ----------------------------------------------------------------------------------------------------
2023-10-14 11:36:04,876 EPOCH 2 done: loss 0.1084 - lr: 0.000044
2023-10-14 11:36:08,445 DEV : loss 0.08696628361940384 - f1-score (micro avg) 0.8056
2023-10-14 11:36:08,470 saving best model
2023-10-14 11:36:09,027 ----------------------------------------------------------------------------------------------------
2023-10-14 11:36:16,208 epoch 3 - iter 144/1445 - loss 0.08013974 - time (sec): 7.18 - samples/sec: 2363.46 - lr: 0.000044 - momentum: 0.000000
2023-10-14 11:36:23,636 epoch 3 - iter 288/1445 - loss 0.07495514 - time (sec): 14.61 - samples/sec: 2333.06 - lr: 0.000043 - momentum: 0.000000
2023-10-14 11:36:30,995 epoch 3 - iter 432/1445 - loss 0.07427969 - time (sec): 21.97 - samples/sec: 2329.93 - lr: 0.000043 - momentum: 0.000000
2023-10-14 11:36:38,057 epoch 3 - iter 576/1445 - loss 0.07707753 - time (sec): 29.03 - samples/sec: 2340.34 - lr: 0.000042 - momentum: 0.000000
2023-10-14 11:36:45,180 epoch 3 - iter 720/1445 - loss 0.07618608 - time (sec): 36.15 - samples/sec: 2332.09 - lr: 0.000042 - momentum: 0.000000
2023-10-14 11:36:52,728 epoch 3 - iter 864/1445 - loss 0.07322862 - time (sec): 43.70 - samples/sec: 2362.62 - lr: 0.000041 - momentum: 0.000000
2023-10-14 11:37:00,007 epoch 3 - iter 1008/1445 - loss 0.07343540 - time (sec): 50.98 - samples/sec: 2362.17 - lr: 0.000041 - momentum: 0.000000
2023-10-14 11:37:07,468 epoch 3 - iter 1152/1445 - loss 0.07535018 - time (sec): 58.44 - samples/sec: 2380.78 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:37:15,113 epoch 3 - iter 1296/1445 - loss 0.07471681 - time (sec): 66.08 - samples/sec: 2374.03 - lr: 0.000039 - momentum: 0.000000
2023-10-14 11:37:22,648 epoch 3 - iter 1440/1445 - loss 0.07360689 - time (sec): 73.62 - samples/sec: 2387.80 - lr: 0.000039 - momentum: 0.000000
2023-10-14 11:37:22,869 ----------------------------------------------------------------------------------------------------
2023-10-14 11:37:22,869 EPOCH 3 done: loss 0.0737 - lr: 0.000039
2023-10-14 11:37:26,430 DEV : loss 0.09837064146995544 - f1-score (micro avg) 0.7909
2023-10-14 11:37:26,451 ----------------------------------------------------------------------------------------------------
2023-10-14 11:37:35,513 epoch 4 - iter 144/1445 - loss 0.04038213 - time (sec): 9.06 - samples/sec: 1871.02 - lr: 0.000038 - momentum: 0.000000
2023-10-14 11:37:43,810 epoch 4 - iter 288/1445 - loss 0.04880213 - time (sec): 17.36 - samples/sec: 2065.39 - lr: 0.000038 - momentum: 0.000000
2023-10-14 11:37:51,934 epoch 4 - iter 432/1445 - loss 0.04623390 - time (sec): 25.48 - samples/sec: 2080.27 - lr: 0.000037 - momentum: 0.000000
2023-10-14 11:37:59,626 epoch 4 - iter 576/1445 - loss 0.05140673 - time (sec): 33.17 - samples/sec: 2130.40 - lr: 0.000037 - momentum: 0.000000
2023-10-14 11:38:06,763 epoch 4 - iter 720/1445 - loss 0.05242646 - time (sec): 40.31 - samples/sec: 2168.66 - lr: 0.000036 - momentum: 0.000000
2023-10-14 11:38:14,123 epoch 4 - iter 864/1445 - loss 0.05243404 - time (sec): 47.67 - samples/sec: 2203.65 - lr: 0.000036 - momentum: 0.000000
2023-10-14 11:38:21,439 epoch 4 - iter 1008/1445 - loss 0.05352965 - time (sec): 54.99 - samples/sec: 2238.25 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:38:28,637 epoch 4 - iter 1152/1445 - loss 0.05228346 - time (sec): 62.18 - samples/sec: 2247.00 - lr: 0.000034 - momentum: 0.000000
2023-10-14 11:38:35,882 epoch 4 - iter 1296/1445 - loss 0.05354625 - time (sec): 69.43 - samples/sec: 2272.85 - lr: 0.000034 - momentum: 0.000000
2023-10-14 11:38:43,190 epoch 4 - iter 1440/1445 - loss 0.05446189 - time (sec): 76.74 - samples/sec: 2290.88 - lr: 0.000033 - momentum: 0.000000
2023-10-14 11:38:43,412 ----------------------------------------------------------------------------------------------------
2023-10-14 11:38:43,413 EPOCH 4 done: loss 0.0544 - lr: 0.000033
2023-10-14 11:38:47,049 DEV : loss 0.13684163987636566 - f1-score (micro avg) 0.7735
2023-10-14 11:38:47,070 ----------------------------------------------------------------------------------------------------
2023-10-14 11:38:54,510 epoch 5 - iter 144/1445 - loss 0.03982192 - time (sec): 7.44 - samples/sec: 2235.21 - lr: 0.000033 - momentum: 0.000000
2023-10-14 11:39:01,852 epoch 5 - iter 288/1445 - loss 0.03969703 - time (sec): 14.78 - samples/sec: 2260.48 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:39:09,061 epoch 5 - iter 432/1445 - loss 0.04370314 - time (sec): 21.99 - samples/sec: 2299.81 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:39:16,385 epoch 5 - iter 576/1445 - loss 0.04527434 - time (sec): 29.31 - samples/sec: 2359.50 - lr: 0.000031 - momentum: 0.000000
2023-10-14 11:39:24,018 epoch 5 - iter 720/1445 - loss 0.04759992 - time (sec): 36.95 - samples/sec: 2359.56 - lr: 0.000031 - momentum: 0.000000
2023-10-14 11:39:31,564 epoch 5 - iter 864/1445 - loss 0.04728067 - time (sec): 44.49 - samples/sec: 2367.71 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:39:39,124 epoch 5 - iter 1008/1445 - loss 0.04654287 - time (sec): 52.05 - samples/sec: 2384.71 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:39:46,236 epoch 5 - iter 1152/1445 - loss 0.04540814 - time (sec): 59.16 - samples/sec: 2381.08 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:39:53,236 epoch 5 - iter 1296/1445 - loss 0.04372098 - time (sec): 66.16 - samples/sec: 2388.85 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:40:00,506 epoch 5 - iter 1440/1445 - loss 0.04412868 - time (sec): 73.44 - samples/sec: 2388.38 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:40:00,795 ----------------------------------------------------------------------------------------------------
2023-10-14 11:40:00,795 EPOCH 5 done: loss 0.0442 - lr: 0.000028
2023-10-14 11:40:04,785 DEV : loss 0.1434909701347351 - f1-score (micro avg) 0.8048
2023-10-14 11:40:04,802 ----------------------------------------------------------------------------------------------------
2023-10-14 11:40:12,079 epoch 6 - iter 144/1445 - loss 0.03394452 - time (sec): 7.28 - samples/sec: 2323.74 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:40:19,193 epoch 6 - iter 288/1445 - loss 0.03447812 - time (sec): 14.39 - samples/sec: 2379.30 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:40:26,689 epoch 6 - iter 432/1445 - loss 0.03348796 - time (sec): 21.89 - samples/sec: 2380.18 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:40:34,468 epoch 6 - iter 576/1445 - loss 0.03487847 - time (sec): 29.66 - samples/sec: 2371.17 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:40:42,197 epoch 6 - iter 720/1445 - loss 0.03632731 - time (sec): 37.39 - samples/sec: 2393.53 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:40:49,599 epoch 6 - iter 864/1445 - loss 0.03568819 - time (sec): 44.80 - samples/sec: 2385.20 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:40:56,596 epoch 6 - iter 1008/1445 - loss 0.03343040 - time (sec): 51.79 - samples/sec: 2387.09 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:41:03,864 epoch 6 - iter 1152/1445 - loss 0.03169355 - time (sec): 59.06 - samples/sec: 2384.17 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:41:11,044 epoch 6 - iter 1296/1445 - loss 0.03126589 - time (sec): 66.24 - samples/sec: 2389.17 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:41:18,341 epoch 6 - iter 1440/1445 - loss 0.03080380 - time (sec): 73.54 - samples/sec: 2389.29 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:41:18,575 ----------------------------------------------------------------------------------------------------
2023-10-14 11:41:18,575 EPOCH 6 done: loss 0.0307 - lr: 0.000022
2023-10-14 11:41:22,171 DEV : loss 0.17948994040489197 - f1-score (micro avg) 0.7937
2023-10-14 11:41:22,187 ----------------------------------------------------------------------------------------------------
2023-10-14 11:41:29,386 epoch 7 - iter 144/1445 - loss 0.02148999 - time (sec): 7.20 - samples/sec: 2365.12 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:41:36,603 epoch 7 - iter 288/1445 - loss 0.02049358 - time (sec): 14.41 - samples/sec: 2374.86 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:41:44,061 epoch 7 - iter 432/1445 - loss 0.02426862 - time (sec): 21.87 - samples/sec: 2406.65 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:41:51,364 epoch 7 - iter 576/1445 - loss 0.02353802 - time (sec): 29.18 - samples/sec: 2400.22 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:41:58,671 epoch 7 - iter 720/1445 - loss 0.02213679 - time (sec): 36.48 - samples/sec: 2414.68 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:42:06,237 epoch 7 - iter 864/1445 - loss 0.02183153 - time (sec): 44.05 - samples/sec: 2392.78 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:42:13,794 epoch 7 - iter 1008/1445 - loss 0.02145289 - time (sec): 51.61 - samples/sec: 2383.07 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:42:21,361 epoch 7 - iter 1152/1445 - loss 0.02240382 - time (sec): 59.17 - samples/sec: 2376.42 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:42:28,790 epoch 7 - iter 1296/1445 - loss 0.02358361 - time (sec): 66.60 - samples/sec: 2359.36 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:42:37,066 epoch 7 - iter 1440/1445 - loss 0.02296526 - time (sec): 74.88 - samples/sec: 2346.33 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:42:37,342 ----------------------------------------------------------------------------------------------------
2023-10-14 11:42:37,342 EPOCH 7 done: loss 0.0229 - lr: 0.000017
2023-10-14 11:42:41,079 DEV : loss 0.17617355287075043 - f1-score (micro avg) 0.807
2023-10-14 11:42:41,102 saving best model
2023-10-14 11:42:41,666 ----------------------------------------------------------------------------------------------------
2023-10-14 11:42:49,496 epoch 8 - iter 144/1445 - loss 0.02117476 - time (sec): 7.83 - samples/sec: 2275.91 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:42:56,718 epoch 8 - iter 288/1445 - loss 0.01850677 - time (sec): 15.05 - samples/sec: 2336.14 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:43:04,263 epoch 8 - iter 432/1445 - loss 0.02027973 - time (sec): 22.59 - samples/sec: 2357.45 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:43:11,468 epoch 8 - iter 576/1445 - loss 0.01932100 - time (sec): 29.80 - samples/sec: 2369.27 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:43:18,867 epoch 8 - iter 720/1445 - loss 0.01835449 - time (sec): 37.20 - samples/sec: 2380.94 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:43:26,156 epoch 8 - iter 864/1445 - loss 0.01762142 - time (sec): 44.49 - samples/sec: 2382.22 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:43:33,446 epoch 8 - iter 1008/1445 - loss 0.01660096 - time (sec): 51.78 - samples/sec: 2363.32 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:43:40,921 epoch 8 - iter 1152/1445 - loss 0.01616639 - time (sec): 59.25 - samples/sec: 2368.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:43:48,521 epoch 8 - iter 1296/1445 - loss 0.01642043 - time (sec): 66.85 - samples/sec: 2371.35 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:43:55,716 epoch 8 - iter 1440/1445 - loss 0.01607383 - time (sec): 74.05 - samples/sec: 2369.57 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:43:56,008 ----------------------------------------------------------------------------------------------------
2023-10-14 11:43:56,008 EPOCH 8 done: loss 0.0160 - lr: 0.000011
2023-10-14 11:43:59,940 DEV : loss 0.20396527647972107 - f1-score (micro avg) 0.7911
2023-10-14 11:43:59,956 ----------------------------------------------------------------------------------------------------
2023-10-14 11:44:07,506 epoch 9 - iter 144/1445 - loss 0.00551300 - time (sec): 7.55 - samples/sec: 2309.36 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:44:14,806 epoch 9 - iter 288/1445 - loss 0.00752272 - time (sec): 14.85 - samples/sec: 2253.81 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:44:22,602 epoch 9 - iter 432/1445 - loss 0.01012645 - time (sec): 22.64 - samples/sec: 2368.69 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:44:29,669 epoch 9 - iter 576/1445 - loss 0.01015045 - time (sec): 29.71 - samples/sec: 2367.09 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:44:36,980 epoch 9 - iter 720/1445 - loss 0.00999283 - time (sec): 37.02 - samples/sec: 2387.63 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:44:44,297 epoch 9 - iter 864/1445 - loss 0.01004974 - time (sec): 44.34 - samples/sec: 2387.55 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:44:51,529 epoch 9 - iter 1008/1445 - loss 0.00938374 - time (sec): 51.57 - samples/sec: 2391.58 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:44:59,107 epoch 9 - iter 1152/1445 - loss 0.00993299 - time (sec): 59.15 - samples/sec: 2373.67 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:45:06,864 epoch 9 - iter 1296/1445 - loss 0.01035581 - time (sec): 66.91 - samples/sec: 2357.53 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:45:14,195 epoch 9 - iter 1440/1445 - loss 0.01097736 - time (sec): 74.24 - samples/sec: 2363.80 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:45:14,464 ----------------------------------------------------------------------------------------------------
2023-10-14 11:45:14,464 EPOCH 9 done: loss 0.0109 - lr: 0.000006
2023-10-14 11:45:17,986 DEV : loss 0.18760253489017487 - f1-score (micro avg) 0.8144
2023-10-14 11:45:18,004 saving best model
2023-10-14 11:45:18,565 ----------------------------------------------------------------------------------------------------
2023-10-14 11:45:26,028 epoch 10 - iter 144/1445 - loss 0.00921785 - time (sec): 7.46 - samples/sec: 2427.67 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:45:33,349 epoch 10 - iter 288/1445 - loss 0.00998994 - time (sec): 14.78 - samples/sec: 2411.98 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:45:40,783 epoch 10 - iter 432/1445 - loss 0.01085145 - time (sec): 22.21 - samples/sec: 2398.82 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:45:48,868 epoch 10 - iter 576/1445 - loss 0.00896974 - time (sec): 30.30 - samples/sec: 2370.81 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:45:56,076 epoch 10 - iter 720/1445 - loss 0.00775903 - time (sec): 37.51 - samples/sec: 2379.75 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:46:03,265 epoch 10 - iter 864/1445 - loss 0.00776058 - time (sec): 44.69 - samples/sec: 2389.41 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:46:10,672 epoch 10 - iter 1008/1445 - loss 0.00775701 - time (sec): 52.10 - samples/sec: 2385.87 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:46:17,927 epoch 10 - iter 1152/1445 - loss 0.00812453 - time (sec): 59.36 - samples/sec: 2383.52 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:46:25,194 epoch 10 - iter 1296/1445 - loss 0.00784918 - time (sec): 66.62 - samples/sec: 2372.11 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:46:32,424 epoch 10 - iter 1440/1445 - loss 0.00772587 - time (sec): 73.85 - samples/sec: 2377.16 - lr: 0.000000 - momentum: 0.000000
2023-10-14 11:46:32,703 ----------------------------------------------------------------------------------------------------
2023-10-14 11:46:32,704 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-14 11:46:36,251 DEV : loss 0.1970515102148056 - f1-score (micro avg) 0.806
2023-10-14 11:46:36,663 ----------------------------------------------------------------------------------------------------
2023-10-14 11:46:36,665 Loading model from best epoch ...
2023-10-14 11:46:38,496 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 11:46:42,384
Results:
- F-score (micro) 0.8044
- F-score (macro) 0.6972
- Accuracy 0.6823
By class:
              precision    recall  f1-score   support

         PER     0.8552    0.7842    0.8182       482
         LOC     0.8819    0.7991    0.8385       458
         ORG     0.5435    0.3623    0.4348        69

   micro avg     0.8516    0.7621    0.8044      1009
   macro avg     0.7602    0.6486    0.6972      1009
weighted avg     0.8460    0.7621    0.8012      1009
2023-10-14 11:46:42,384 ----------------------------------------------------------------------------------------------------
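The checkpoint saved as best-model.pt under the training base path can be loaded back with the standard Flair API. A minimal usage sketch (the example sentence is illustrative only):

from flair.data import Sentence
from flair.models import SequenceTagger

# load the best checkpoint written during training
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

# tag a Dutch sentence and print the predicted entity spans
sentence = Sentence("Willem van Oranje werd geboren in Dillenburg.")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)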