2023-10-17 18:24:39,029 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:24:39,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 18:24:39,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 Train: 5777 sentences
2023-10-17 18:24:39,030 (train_with_dev=False, train_with_test=False)
2023-10-17 18:24:39,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 Training Params:
2023-10-17 18:24:39,030 - learning_rate: "3e-05"
2023-10-17 18:24:39,030 - mini_batch_size: "4"
2023-10-17 18:24:39,030 - max_epochs: "10"
2023-10-17 18:24:39,030 - shuffle: "True"
2023-10-17 18:24:39,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 Plugins:
2023-10-17 18:24:39,030 - TensorboardLogger
2023-10-17 18:24:39,030 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:24:39,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:24:39,030 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:24:39,030 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,030 Computation:
2023-10-17 18:24:39,030 - compute on device: cuda:0
2023-10-17 18:24:39,031 - embedding storage: none
2023-10-17 18:24:39,031 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,031 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:24:39,031 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:39,031 ----------------------------------------------------------------------------------------------------
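Note: the hyperparameters above, together with the embedding model and options encoded in the training base path (hmteams/teams-base-historic-multilingual-discriminator, bs4, e10, lr3e-05, poolingfirst, layers-1, crfFalse), correspond to a standard Flair fine-tuning setup. The following is a minimal sketch of such a run, not the exact script behind this log; the output path and the hidden_size value are illustrative assumptions.

# Minimal Flair fine-tuning sketch matching the logged configuration (a reconstruction, not the original script).
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dutch split of the ICDAR Europeana NER corpus (5777 train / 722 dev / 723 test sentences)
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Last transformer layer only, first-subtoken pooling ("layers-1" / "poolingfirst" in the base path)
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear head without CRF or RNN, as in the printed architecture above
tagger = SequenceTagger(
    hidden_size=256,  # unused without an RNN; the value here is an assumption
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear schedule and warmup by default,
# consistent with the LinearScheduler plugin (warmup_fraction 0.1) in the log
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-example-run",  # output path is an assumption; see the base path logged above
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)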
2023-10-17 18:24:39,031 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:24:46,064 epoch 1 - iter 144/1445 - loss 2.40249140 - time (sec): 7.03 - samples/sec: 2391.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:24:53,424 epoch 1 - iter 288/1445 - loss 1.36473047 - time (sec): 14.39 - samples/sec: 2360.01 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:25:00,668 epoch 1 - iter 432/1445 - loss 0.96670832 - time (sec): 21.64 - samples/sec: 2368.22 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:25:07,775 epoch 1 - iter 576/1445 - loss 0.75923383 - time (sec): 28.74 - samples/sec: 2399.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:25:14,606 epoch 1 - iter 720/1445 - loss 0.63444395 - time (sec): 35.57 - samples/sec: 2424.50 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:25:21,617 epoch 1 - iter 864/1445 - loss 0.54802313 - time (sec): 42.59 - samples/sec: 2450.41 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:25:28,797 epoch 1 - iter 1008/1445 - loss 0.49139637 - time (sec): 49.77 - samples/sec: 2451.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:25:35,915 epoch 1 - iter 1152/1445 - loss 0.44471338 - time (sec): 56.88 - samples/sec: 2457.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:25:42,772 epoch 1 - iter 1296/1445 - loss 0.40703382 - time (sec): 63.74 - samples/sec: 2471.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:25:50,326 epoch 1 - iter 1440/1445 - loss 0.37927457 - time (sec): 71.29 - samples/sec: 2461.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:25:50,587 ----------------------------------------------------------------------------------------------------
2023-10-17 18:25:50,587 EPOCH 1 done: loss 0.3781 - lr: 0.000030
2023-10-17 18:25:53,549 DEV : loss 0.08337056636810303 - f1-score (micro avg) 0.7966
2023-10-17 18:25:53,568 saving best model
2023-10-17 18:25:53,937 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:01,081 epoch 2 - iter 144/1445 - loss 0.09997945 - time (sec): 7.14 - samples/sec: 2323.79 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:26:08,349 epoch 2 - iter 288/1445 - loss 0.10050675 - time (sec): 14.41 - samples/sec: 2377.08 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:26:15,230 epoch 2 - iter 432/1445 - loss 0.09758629 - time (sec): 21.29 - samples/sec: 2429.50 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:26:22,200 epoch 2 - iter 576/1445 - loss 0.10550488 - time (sec): 28.26 - samples/sec: 2439.70 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:26:29,525 epoch 2 - iter 720/1445 - loss 0.09799138 - time (sec): 35.58 - samples/sec: 2465.77 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:26:36,921 epoch 2 - iter 864/1445 - loss 0.09518587 - time (sec): 42.98 - samples/sec: 2494.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:26:43,832 epoch 2 - iter 1008/1445 - loss 0.09965579 - time (sec): 49.89 - samples/sec: 2465.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:26:50,996 epoch 2 - iter 1152/1445 - loss 0.09757907 - time (sec): 57.05 - samples/sec: 2484.80 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:26:58,119 epoch 2 - iter 1296/1445 - loss 0.09642403 - time (sec): 64.18 - samples/sec: 2466.31 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:27:05,384 epoch 2 - iter 1440/1445 - loss 0.09419689 - time (sec): 71.44 - samples/sec: 2457.35 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:27:05,610 ----------------------------------------------------------------------------------------------------
2023-10-17 18:27:05,610 EPOCH 2 done: loss 0.0941 - lr: 0.000027
2023-10-17 18:27:09,438 DEV : loss 0.059414949268102646 - f1-score (micro avg) 0.8677
2023-10-17 18:27:09,455 saving best model
2023-10-17 18:27:09,918 ----------------------------------------------------------------------------------------------------
2023-10-17 18:27:16,714 epoch 3 - iter 144/1445 - loss 0.05939246 - time (sec): 6.79 - samples/sec: 2548.39 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:27:24,383 epoch 3 - iter 288/1445 - loss 0.06503883 - time (sec): 14.46 - samples/sec: 2347.37 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:27:31,793 epoch 3 - iter 432/1445 - loss 0.07206773 - time (sec): 21.87 - samples/sec: 2381.75 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:27:38,996 epoch 3 - iter 576/1445 - loss 0.06976120 - time (sec): 29.07 - samples/sec: 2410.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:27:46,120 epoch 3 - iter 720/1445 - loss 0.06784493 - time (sec): 36.20 - samples/sec: 2440.24 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:27:53,122 epoch 3 - iter 864/1445 - loss 0.07022398 - time (sec): 43.20 - samples/sec: 2446.68 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:28:00,718 epoch 3 - iter 1008/1445 - loss 0.07026795 - time (sec): 50.80 - samples/sec: 2419.53 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:28:08,051 epoch 3 - iter 1152/1445 - loss 0.06785824 - time (sec): 58.13 - samples/sec: 2435.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:28:15,597 epoch 3 - iter 1296/1445 - loss 0.06757860 - time (sec): 65.67 - samples/sec: 2414.48 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:28:23,280 epoch 3 - iter 1440/1445 - loss 0.06798115 - time (sec): 73.36 - samples/sec: 2393.59 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:28:23,534 ----------------------------------------------------------------------------------------------------
2023-10-17 18:28:23,534 EPOCH 3 done: loss 0.0679 - lr: 0.000023
2023-10-17 18:28:26,900 DEV : loss 0.07125767320394516 - f1-score (micro avg) 0.8778
2023-10-17 18:28:26,918 saving best model
2023-10-17 18:28:27,479 ----------------------------------------------------------------------------------------------------
2023-10-17 18:28:34,657 epoch 4 - iter 144/1445 - loss 0.03823889 - time (sec): 7.17 - samples/sec: 2571.13 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:28:42,028 epoch 4 - iter 288/1445 - loss 0.04172689 - time (sec): 14.54 - samples/sec: 2513.34 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:28:48,844 epoch 4 - iter 432/1445 - loss 0.04752585 - time (sec): 21.35 - samples/sec: 2481.42 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:28:55,928 epoch 4 - iter 576/1445 - loss 0.05232222 - time (sec): 28.44 - samples/sec: 2491.59 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:29:02,810 epoch 4 - iter 720/1445 - loss 0.05199722 - time (sec): 35.32 - samples/sec: 2472.85 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:29:09,858 epoch 4 - iter 864/1445 - loss 0.05516103 - time (sec): 42.37 - samples/sec: 2472.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:29:16,970 epoch 4 - iter 1008/1445 - loss 0.05469799 - time (sec): 49.48 - samples/sec: 2479.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:29:24,308 epoch 4 - iter 1152/1445 - loss 0.05442157 - time (sec): 56.82 - samples/sec: 2468.09 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:29:31,541 epoch 4 - iter 1296/1445 - loss 0.05372314 - time (sec): 64.05 - samples/sec: 2458.54 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:29:38,985 epoch 4 - iter 1440/1445 - loss 0.05417645 - time (sec): 71.49 - samples/sec: 2457.99 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:29:39,246 ----------------------------------------------------------------------------------------------------
2023-10-17 18:29:39,247 EPOCH 4 done: loss 0.0541 - lr: 0.000020
2023-10-17 18:29:42,664 DEV : loss 0.10595724731683731 - f1-score (micro avg) 0.8568
2023-10-17 18:29:42,685 ----------------------------------------------------------------------------------------------------
2023-10-17 18:29:50,238 epoch 5 - iter 144/1445 - loss 0.03925321 - time (sec): 7.55 - samples/sec: 2234.30 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:29:57,133 epoch 5 - iter 288/1445 - loss 0.03820912 - time (sec): 14.45 - samples/sec: 2291.33 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:30:04,590 epoch 5 - iter 432/1445 - loss 0.03770167 - time (sec): 21.90 - samples/sec: 2391.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:30:11,483 epoch 5 - iter 576/1445 - loss 0.03620886 - time (sec): 28.80 - samples/sec: 2405.91 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:30:18,766 epoch 5 - iter 720/1445 - loss 0.03355020 - time (sec): 36.08 - samples/sec: 2403.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:30:25,813 epoch 5 - iter 864/1445 - loss 0.03410430 - time (sec): 43.13 - samples/sec: 2426.47 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:30:33,029 epoch 5 - iter 1008/1445 - loss 0.03521635 - time (sec): 50.34 - samples/sec: 2439.19 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:30:39,998 epoch 5 - iter 1152/1445 - loss 0.03751550 - time (sec): 57.31 - samples/sec: 2448.07 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:30:46,953 epoch 5 - iter 1296/1445 - loss 0.03804211 - time (sec): 64.27 - samples/sec: 2447.82 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:30:53,948 epoch 5 - iter 1440/1445 - loss 0.03773588 - time (sec): 71.26 - samples/sec: 2467.64 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:30:54,165 ----------------------------------------------------------------------------------------------------
2023-10-17 18:30:54,165 EPOCH 5 done: loss 0.0378 - lr: 0.000017
2023-10-17 18:30:57,498 DEV : loss 0.1191408783197403 - f1-score (micro avg) 0.8652
2023-10-17 18:30:57,516 ----------------------------------------------------------------------------------------------------
2023-10-17 18:31:04,570 epoch 6 - iter 144/1445 - loss 0.01536583 - time (sec): 7.05 - samples/sec: 2581.94 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:31:11,647 epoch 6 - iter 288/1445 - loss 0.01909159 - time (sec): 14.13 - samples/sec: 2539.44 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:31:18,546 epoch 6 - iter 432/1445 - loss 0.02633944 - time (sec): 21.03 - samples/sec: 2550.13 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:31:25,686 epoch 6 - iter 576/1445 - loss 0.02999207 - time (sec): 28.17 - samples/sec: 2528.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:31:32,920 epoch 6 - iter 720/1445 - loss 0.03103521 - time (sec): 35.40 - samples/sec: 2533.15 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:31:40,138 epoch 6 - iter 864/1445 - loss 0.03041336 - time (sec): 42.62 - samples/sec: 2521.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:31:47,125 epoch 6 - iter 1008/1445 - loss 0.03013301 - time (sec): 49.61 - samples/sec: 2514.36 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:31:54,126 epoch 6 - iter 1152/1445 - loss 0.02977871 - time (sec): 56.61 - samples/sec: 2495.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:32:01,353 epoch 6 - iter 1296/1445 - loss 0.02885957 - time (sec): 63.84 - samples/sec: 2485.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:32:08,036 epoch 6 - iter 1440/1445 - loss 0.02830552 - time (sec): 70.52 - samples/sec: 2492.18 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:32:08,274 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:08,275 EPOCH 6 done: loss 0.0283 - lr: 0.000013
2023-10-17 18:32:11,731 DEV : loss 0.1016329899430275 - f1-score (micro avg) 0.8851
2023-10-17 18:32:11,750 saving best model
2023-10-17 18:32:12,213 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:19,813 epoch 7 - iter 144/1445 - loss 0.01283763 - time (sec): 7.60 - samples/sec: 2326.72 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:32:27,511 epoch 7 - iter 288/1445 - loss 0.02088703 - time (sec): 15.30 - samples/sec: 2243.04 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:32:35,883 epoch 7 - iter 432/1445 - loss 0.02183665 - time (sec): 23.67 - samples/sec: 2242.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:32:43,093 epoch 7 - iter 576/1445 - loss 0.02240611 - time (sec): 30.88 - samples/sec: 2318.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:32:50,374 epoch 7 - iter 720/1445 - loss 0.02292437 - time (sec): 38.16 - samples/sec: 2347.37 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:32:57,550 epoch 7 - iter 864/1445 - loss 0.02185865 - time (sec): 45.34 - samples/sec: 2364.34 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:33:04,627 epoch 7 - iter 1008/1445 - loss 0.02156408 - time (sec): 52.41 - samples/sec: 2375.00 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:33:11,457 epoch 7 - iter 1152/1445 - loss 0.02042531 - time (sec): 59.24 - samples/sec: 2390.81 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:33:18,355 epoch 7 - iter 1296/1445 - loss 0.02042436 - time (sec): 66.14 - samples/sec: 2396.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:33:25,535 epoch 7 - iter 1440/1445 - loss 0.02033836 - time (sec): 73.32 - samples/sec: 2396.58 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:33:25,757 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:25,757 EPOCH 7 done: loss 0.0204 - lr: 0.000010
2023-10-17 18:33:29,168 DEV : loss 0.1190648004412651 - f1-score (micro avg) 0.8655
2023-10-17 18:33:29,188 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:36,420 epoch 8 - iter 144/1445 - loss 0.00857818 - time (sec): 7.23 - samples/sec: 2455.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:33:43,376 epoch 8 - iter 288/1445 - loss 0.01132420 - time (sec): 14.19 - samples/sec: 2487.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:33:50,188 epoch 8 - iter 432/1445 - loss 0.01072450 - time (sec): 21.00 - samples/sec: 2477.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:33:57,282 epoch 8 - iter 576/1445 - loss 0.01112080 - time (sec): 28.09 - samples/sec: 2474.40 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:34:04,320 epoch 8 - iter 720/1445 - loss 0.01113390 - time (sec): 35.13 - samples/sec: 2453.70 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:34:11,554 epoch 8 - iter 864/1445 - loss 0.01168471 - time (sec): 42.37 - samples/sec: 2449.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:34:18,690 epoch 8 - iter 1008/1445 - loss 0.01211707 - time (sec): 49.50 - samples/sec: 2443.14 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:34:26,217 epoch 8 - iter 1152/1445 - loss 0.01308008 - time (sec): 57.03 - samples/sec: 2458.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:34:33,244 epoch 8 - iter 1296/1445 - loss 0.01348119 - time (sec): 64.06 - samples/sec: 2458.87 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:34:40,430 epoch 8 - iter 1440/1445 - loss 0.01338325 - time (sec): 71.24 - samples/sec: 2464.52 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:34:40,677 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:40,678 EPOCH 8 done: loss 0.0134 - lr: 0.000007
2023-10-17 18:34:44,135 DEV : loss 0.1326545625925064 - f1-score (micro avg) 0.8661
2023-10-17 18:34:44,159 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:51,536 epoch 9 - iter 144/1445 - loss 0.00677100 - time (sec): 7.37 - samples/sec: 2388.08 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:34:58,666 epoch 9 - iter 288/1445 - loss 0.00589925 - time (sec): 14.51 - samples/sec: 2496.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:35:05,566 epoch 9 - iter 432/1445 - loss 0.00710020 - time (sec): 21.40 - samples/sec: 2473.34 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:35:12,669 epoch 9 - iter 576/1445 - loss 0.00726776 - time (sec): 28.51 - samples/sec: 2472.61 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:35:19,808 epoch 9 - iter 720/1445 - loss 0.00665823 - time (sec): 35.65 - samples/sec: 2484.41 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:35:27,339 epoch 9 - iter 864/1445 - loss 0.00694911 - time (sec): 43.18 - samples/sec: 2452.15 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:35:34,823 epoch 9 - iter 1008/1445 - loss 0.00785515 - time (sec): 50.66 - samples/sec: 2461.29 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:35:41,857 epoch 9 - iter 1152/1445 - loss 0.00791889 - time (sec): 57.70 - samples/sec: 2456.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:35:48,930 epoch 9 - iter 1296/1445 - loss 0.00819709 - time (sec): 64.77 - samples/sec: 2468.17 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:35:55,665 epoch 9 - iter 1440/1445 - loss 0.00824064 - time (sec): 71.50 - samples/sec: 2458.58 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:35:55,884 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:55,885 EPOCH 9 done: loss 0.0082 - lr: 0.000003
2023-10-17 18:35:59,265 DEV : loss 0.13092152774333954 - f1-score (micro avg) 0.8755
2023-10-17 18:35:59,283 ----------------------------------------------------------------------------------------------------
2023-10-17 18:36:06,634 epoch 10 - iter 144/1445 - loss 0.00800238 - time (sec): 7.35 - samples/sec: 2446.53 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:36:13,521 epoch 10 - iter 288/1445 - loss 0.00496623 - time (sec): 14.24 - samples/sec: 2460.81 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:36:20,659 epoch 10 - iter 432/1445 - loss 0.00568248 - time (sec): 21.37 - samples/sec: 2474.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:36:27,805 epoch 10 - iter 576/1445 - loss 0.00602579 - time (sec): 28.52 - samples/sec: 2476.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:36:34,747 epoch 10 - iter 720/1445 - loss 0.00617050 - time (sec): 35.46 - samples/sec: 2477.34 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:36:41,884 epoch 10 - iter 864/1445 - loss 0.00622325 - time (sec): 42.60 - samples/sec: 2489.90 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:36:48,941 epoch 10 - iter 1008/1445 - loss 0.00593060 - time (sec): 49.66 - samples/sec: 2477.93 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:36:55,998 epoch 10 - iter 1152/1445 - loss 0.00607493 - time (sec): 56.71 - samples/sec: 2466.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:37:02,996 epoch 10 - iter 1296/1445 - loss 0.00609370 - time (sec): 63.71 - samples/sec: 2476.51 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:37:09,948 epoch 10 - iter 1440/1445 - loss 0.00604769 - time (sec): 70.66 - samples/sec: 2488.21 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:37:10,161 ----------------------------------------------------------------------------------------------------
2023-10-17 18:37:10,162 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 18:37:13,655 DEV : loss 0.1363741010427475 - f1-score (micro avg) 0.877
2023-10-17 18:37:14,123 ----------------------------------------------------------------------------------------------------
2023-10-17 18:37:14,124 Loading model from best epoch ...
2023-10-17 18:37:15,533 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 18:37:18,597
Results:
- F-score (micro) 0.8566
- F-score (macro) 0.7724
- Accuracy 0.7604
By class:
              precision    recall  f1-score   support

         PER     0.8104    0.8693    0.8388       482
         LOC     0.9320    0.8974    0.9143       458
         ORG     0.6875    0.4783    0.5641        69

   micro avg     0.8579    0.8553    0.8566      1009
   macro avg     0.8100    0.7483    0.7724      1009
weighted avg     0.8572    0.8553    0.8543      1009
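The aggregate rows follow directly from the per-class rows above; a quick arithmetic check (values copied from the table, rounding as reported):

# Sanity check of the macro and weighted F1 averages from the per-class results.
f1 = {"PER": 0.8388, "LOC": 0.9143, "ORG": 0.5641}
support = {"PER": 482, "LOC": 458, "ORG": 69}

macro_f1 = sum(f1.values()) / len(f1)                                       # ~0.7724
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())   # ~0.8543
print(round(macro_f1, 4), round(weighted_f1, 4))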
2023-10-17 18:37:18,597 ----------------------------------------------------------------------------------------------------
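To tag new text with the resulting best-model.pt, the standard Flair prediction API applies. A minimal sketch follows; the model path is assumed to be the training base path from the log, and the example sentence is illustrative.

# Load the fine-tuned tagger and run it on a Dutch sentence (illustrative usage, not part of the training log).
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator"
    "-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Vincent van Gogh werd geboren in Zundert.")
tagger.predict(sentence)

# Predicted spans are decoded from the 13-tag BIOES dictionary into LOC, PER, and ORG entities
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)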