flair-icdar-nl / training.log
2023-10-17 18:04:35,825 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Train: 5777 sentences
2023-10-17 18:04:35,826 (train_with_dev=False, train_with_test=False)
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Training Params:
2023-10-17 18:04:35,826 - learning_rate: "3e-05"
2023-10-17 18:04:35,826 - mini_batch_size: "8"
2023-10-17 18:04:35,826 - max_epochs: "10"
2023-10-17 18:04:35,826 - shuffle: "True"
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Plugins:
2023-10-17 18:04:35,826 - TensorboardLogger
2023-10-17 18:04:35,826 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
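The LinearScheduler plugin above warms the learning rate up over the first 10% of steps and then decays it linearly to zero, which is exactly the shape the `lr:` column traces below (rising through epoch 1, falling afterwards). A minimal sketch of that schedule in pure Python, assuming 723 iterations × 10 epochs as logged (Flair's internal step accounting may differ by an off-by-one):

```python
def linear_schedule_lr(step, max_lr=3e-05, total_steps=7230, warmup_fraction=0.1):
    """Linear warmup to max_lr, then linear decay to zero,
    mirroring the lr column in the per-iteration log lines."""
    warmup_steps = int(total_steps * warmup_fraction)  # 723 steps = first epoch
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    return max_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values, e.g. epoch 1 / iter 72 and epoch 2 / iter 144:
print(round(linear_schedule_lr(72), 6))         # 3e-06, as logged
print(round(linear_schedule_lr(723 + 144), 6))  # 2.9e-05, as logged
```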
2023-10-17 18:04:35,827 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:04:35,827 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Computation:
2023-10-17 18:04:35,827 - compute on device: cuda:0
2023-10-17 18:04:35,827 - embedding storage: none
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Logging anything other than scalars to TensorBoard is currently not supported.
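Each per-iteration line below follows a fixed format, so loss and learning-rate curves can be recovered from this log with a small regex. A sketch (the sample line is copied verbatim from the log; the pattern is an assumption about the format, not a Flair API):

```python
import re

LINE = ("2023-10-17 18:04:40,869 epoch 1 - iter 72/723 - loss 3.14780265 "
        "- time (sec): 5.04 - samples/sec: 3212.85 - lr: 0.000003 - momentum: 0.000000")

# Named groups for the fields worth plotting.
PATTERN = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) "
    r"- loss (?P<loss>[\d.]+).*?lr: (?P<lr>[\d.]+)"
)

m = PATTERN.search(LINE)
print(m["epoch"], m["iter"], float(m["loss"]), float(m["lr"]))
```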
2023-10-17 18:04:40,869 epoch 1 - iter 72/723 - loss 3.14780265 - time (sec): 5.04 - samples/sec: 3212.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:04:46,354 epoch 1 - iter 144/723 - loss 1.91689024 - time (sec): 10.53 - samples/sec: 3248.38 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:04:51,606 epoch 1 - iter 216/723 - loss 1.35009273 - time (sec): 15.78 - samples/sec: 3276.04 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:04:56,946 epoch 1 - iter 288/723 - loss 1.05729540 - time (sec): 21.12 - samples/sec: 3274.92 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:05:02,237 epoch 1 - iter 360/723 - loss 0.87211949 - time (sec): 26.41 - samples/sec: 3299.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:05:07,703 epoch 1 - iter 432/723 - loss 0.74573597 - time (sec): 31.87 - samples/sec: 3312.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:05:12,988 epoch 1 - iter 504/723 - loss 0.65737284 - time (sec): 37.16 - samples/sec: 3318.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:05:18,380 epoch 1 - iter 576/723 - loss 0.59117923 - time (sec): 42.55 - samples/sec: 3318.60 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:05:24,033 epoch 1 - iter 648/723 - loss 0.53883556 - time (sec): 48.20 - samples/sec: 3299.42 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:05:28,712 epoch 1 - iter 720/723 - loss 0.49808664 - time (sec): 52.88 - samples/sec: 3321.46 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:05:28,893 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:28,894 EPOCH 1 done: loss 0.4968 - lr: 0.000030
2023-10-17 18:05:32,615 DEV : loss 0.10252858698368073 - f1-score (micro avg) 0.6435
2023-10-17 18:05:32,634 saving best model
2023-10-17 18:05:33,009 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:38,412 epoch 2 - iter 72/723 - loss 0.08824932 - time (sec): 5.40 - samples/sec: 3445.98 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:05:43,268 epoch 2 - iter 144/723 - loss 0.09284165 - time (sec): 10.26 - samples/sec: 3459.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:05:49,301 epoch 2 - iter 216/723 - loss 0.09672994 - time (sec): 16.29 - samples/sec: 3286.52 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:05:54,557 epoch 2 - iter 288/723 - loss 0.09200487 - time (sec): 21.55 - samples/sec: 3302.04 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:05:59,668 epoch 2 - iter 360/723 - loss 0.09466436 - time (sec): 26.66 - samples/sec: 3295.03 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:04,651 epoch 2 - iter 432/723 - loss 0.09512636 - time (sec): 31.64 - samples/sec: 3301.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:10,475 epoch 2 - iter 504/723 - loss 0.09272504 - time (sec): 37.47 - samples/sec: 3290.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:15,387 epoch 2 - iter 576/723 - loss 0.09030789 - time (sec): 42.38 - samples/sec: 3293.41 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:20,677 epoch 2 - iter 648/723 - loss 0.08892716 - time (sec): 47.67 - samples/sec: 3294.64 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:26,199 epoch 2 - iter 720/723 - loss 0.08670549 - time (sec): 53.19 - samples/sec: 3305.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:26,360 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:26,360 EPOCH 2 done: loss 0.0868 - lr: 0.000027
2023-10-17 18:06:29,657 DEV : loss 0.07472483813762665 - f1-score (micro avg) 0.8086
2023-10-17 18:06:29,676 saving best model
2023-10-17 18:06:30,161 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:35,368 epoch 3 - iter 72/723 - loss 0.05396348 - time (sec): 5.21 - samples/sec: 3359.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:40,571 epoch 3 - iter 144/723 - loss 0.05386152 - time (sec): 10.41 - samples/sec: 3348.20 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:45,548 epoch 3 - iter 216/723 - loss 0.05763223 - time (sec): 15.39 - samples/sec: 3351.57 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:50,644 epoch 3 - iter 288/723 - loss 0.05871276 - time (sec): 20.48 - samples/sec: 3338.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:06:56,207 epoch 3 - iter 360/723 - loss 0.05735120 - time (sec): 26.05 - samples/sec: 3312.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:07:02,101 epoch 3 - iter 432/723 - loss 0.05744013 - time (sec): 31.94 - samples/sec: 3320.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:07:07,544 epoch 3 - iter 504/723 - loss 0.05703734 - time (sec): 37.38 - samples/sec: 3329.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:07:12,605 epoch 3 - iter 576/723 - loss 0.05670689 - time (sec): 42.44 - samples/sec: 3338.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:07:17,846 epoch 3 - iter 648/723 - loss 0.05681013 - time (sec): 47.68 - samples/sec: 3330.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:07:23,059 epoch 3 - iter 720/723 - loss 0.05774106 - time (sec): 52.90 - samples/sec: 3317.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:07:23,314 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:23,314 EPOCH 3 done: loss 0.0578 - lr: 0.000023
2023-10-17 18:07:27,112 DEV : loss 0.05712844431400299 - f1-score (micro avg) 0.8812
2023-10-17 18:07:27,129 saving best model
2023-10-17 18:07:27,652 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:32,710 epoch 4 - iter 72/723 - loss 0.03483535 - time (sec): 5.06 - samples/sec: 3414.70 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:07:37,907 epoch 4 - iter 144/723 - loss 0.03934497 - time (sec): 10.25 - samples/sec: 3374.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:07:43,328 epoch 4 - iter 216/723 - loss 0.03886039 - time (sec): 15.67 - samples/sec: 3339.03 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:07:48,534 epoch 4 - iter 288/723 - loss 0.03985048 - time (sec): 20.88 - samples/sec: 3333.40 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:07:53,639 epoch 4 - iter 360/723 - loss 0.03887291 - time (sec): 25.98 - samples/sec: 3306.91 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:07:59,171 epoch 4 - iter 432/723 - loss 0.04024768 - time (sec): 31.52 - samples/sec: 3306.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:04,658 epoch 4 - iter 504/723 - loss 0.04285831 - time (sec): 37.00 - samples/sec: 3311.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:09,714 epoch 4 - iter 576/723 - loss 0.04317139 - time (sec): 42.06 - samples/sec: 3314.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:14,986 epoch 4 - iter 648/723 - loss 0.04227368 - time (sec): 47.33 - samples/sec: 3321.28 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:08:20,439 epoch 4 - iter 720/723 - loss 0.04245703 - time (sec): 52.78 - samples/sec: 3326.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:08:20,628 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,628 EPOCH 4 done: loss 0.0424 - lr: 0.000020
2023-10-17 18:08:23,988 DEV : loss 0.06662245094776154 - f1-score (micro avg) 0.8791
2023-10-17 18:08:24,007 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:29,324 epoch 5 - iter 72/723 - loss 0.01990662 - time (sec): 5.32 - samples/sec: 3344.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:08:35,095 epoch 5 - iter 144/723 - loss 0.02279437 - time (sec): 11.09 - samples/sec: 3189.92 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:08:40,075 epoch 5 - iter 216/723 - loss 0.02479149 - time (sec): 16.07 - samples/sec: 3231.66 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:08:45,591 epoch 5 - iter 288/723 - loss 0.02719107 - time (sec): 21.58 - samples/sec: 3262.26 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:08:50,464 epoch 5 - iter 360/723 - loss 0.02575671 - time (sec): 26.46 - samples/sec: 3295.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:08:55,631 epoch 5 - iter 432/723 - loss 0.02771082 - time (sec): 31.62 - samples/sec: 3302.82 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:00,979 epoch 5 - iter 504/723 - loss 0.02956349 - time (sec): 36.97 - samples/sec: 3277.40 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:06,490 epoch 5 - iter 576/723 - loss 0.03084776 - time (sec): 42.48 - samples/sec: 3270.69 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:11,875 epoch 5 - iter 648/723 - loss 0.03111872 - time (sec): 47.87 - samples/sec: 3273.52 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:17,582 epoch 5 - iter 720/723 - loss 0.03213599 - time (sec): 53.57 - samples/sec: 3278.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:17,791 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:17,791 EPOCH 5 done: loss 0.0322 - lr: 0.000017
2023-10-17 18:09:21,121 DEV : loss 0.10735854506492615 - f1-score (micro avg) 0.852
2023-10-17 18:09:21,139 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:26,445 epoch 6 - iter 72/723 - loss 0.01459516 - time (sec): 5.30 - samples/sec: 3243.72 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:31,377 epoch 6 - iter 144/723 - loss 0.01916253 - time (sec): 10.24 - samples/sec: 3349.32 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:36,924 epoch 6 - iter 216/723 - loss 0.02024065 - time (sec): 15.78 - samples/sec: 3321.70 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:41,719 epoch 6 - iter 288/723 - loss 0.01952469 - time (sec): 20.58 - samples/sec: 3309.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:09:47,484 epoch 6 - iter 360/723 - loss 0.02084252 - time (sec): 26.34 - samples/sec: 3323.30 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:09:52,279 epoch 6 - iter 432/723 - loss 0.02140543 - time (sec): 31.14 - samples/sec: 3382.54 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:09:57,342 epoch 6 - iter 504/723 - loss 0.02072648 - time (sec): 36.20 - samples/sec: 3382.64 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:02,892 epoch 6 - iter 576/723 - loss 0.02209234 - time (sec): 41.75 - samples/sec: 3391.15 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:07,864 epoch 6 - iter 648/723 - loss 0.02271147 - time (sec): 46.72 - samples/sec: 3387.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:13,045 epoch 6 - iter 720/723 - loss 0.02324330 - time (sec): 51.91 - samples/sec: 3386.73 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:13,187 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:13,187 EPOCH 6 done: loss 0.0233 - lr: 0.000013
2023-10-17 18:10:16,951 DEV : loss 0.09898053109645844 - f1-score (micro avg) 0.8787
2023-10-17 18:10:16,973 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:22,091 epoch 7 - iter 72/723 - loss 0.01179340 - time (sec): 5.12 - samples/sec: 3273.92 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:27,430 epoch 7 - iter 144/723 - loss 0.01110712 - time (sec): 10.46 - samples/sec: 3283.64 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:32,925 epoch 7 - iter 216/723 - loss 0.01314864 - time (sec): 15.95 - samples/sec: 3319.64 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:38,240 epoch 7 - iter 288/723 - loss 0.01571089 - time (sec): 21.27 - samples/sec: 3304.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:43,718 epoch 7 - iter 360/723 - loss 0.01630848 - time (sec): 26.74 - samples/sec: 3311.36 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:49,005 epoch 7 - iter 432/723 - loss 0.01910391 - time (sec): 32.03 - samples/sec: 3309.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:54,705 epoch 7 - iter 504/723 - loss 0.01877318 - time (sec): 37.73 - samples/sec: 3310.78 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:59,635 epoch 7 - iter 576/723 - loss 0.01811978 - time (sec): 42.66 - samples/sec: 3315.06 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:11:04,812 epoch 7 - iter 648/723 - loss 0.01804125 - time (sec): 47.84 - samples/sec: 3320.49 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:10,018 epoch 7 - iter 720/723 - loss 0.01810548 - time (sec): 53.04 - samples/sec: 3312.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:10,198 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:10,199 EPOCH 7 done: loss 0.0181 - lr: 0.000010
2023-10-17 18:11:13,489 DEV : loss 0.112530916929245 - f1-score (micro avg) 0.8757
2023-10-17 18:11:13,508 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:18,960 epoch 8 - iter 72/723 - loss 0.00982102 - time (sec): 5.45 - samples/sec: 3354.45 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:24,206 epoch 8 - iter 144/723 - loss 0.01111392 - time (sec): 10.70 - samples/sec: 3381.94 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:11:29,275 epoch 8 - iter 216/723 - loss 0.01252694 - time (sec): 15.77 - samples/sec: 3393.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:11:34,521 epoch 8 - iter 288/723 - loss 0.01206236 - time (sec): 21.01 - samples/sec: 3334.63 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:11:40,082 epoch 8 - iter 360/723 - loss 0.01224851 - time (sec): 26.57 - samples/sec: 3331.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:11:45,118 epoch 8 - iter 432/723 - loss 0.01187705 - time (sec): 31.61 - samples/sec: 3354.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:11:50,453 epoch 8 - iter 504/723 - loss 0.01236789 - time (sec): 36.94 - samples/sec: 3310.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:11:55,804 epoch 8 - iter 576/723 - loss 0.01200156 - time (sec): 42.30 - samples/sec: 3306.51 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:12:01,328 epoch 8 - iter 648/723 - loss 0.01252768 - time (sec): 47.82 - samples/sec: 3309.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:12:06,499 epoch 8 - iter 720/723 - loss 0.01258916 - time (sec): 52.99 - samples/sec: 3318.22 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:12:06,648 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:06,648 EPOCH 8 done: loss 0.0126 - lr: 0.000007
2023-10-17 18:12:10,019 DEV : loss 0.13107195496559143 - f1-score (micro avg) 0.8717
2023-10-17 18:12:10,042 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:15,640 epoch 9 - iter 72/723 - loss 0.00761981 - time (sec): 5.60 - samples/sec: 3132.39 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:12:20,962 epoch 9 - iter 144/723 - loss 0.00858213 - time (sec): 10.92 - samples/sec: 3216.86 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:12:26,242 epoch 9 - iter 216/723 - loss 0.00853157 - time (sec): 16.20 - samples/sec: 3295.29 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:12:31,518 epoch 9 - iter 288/723 - loss 0.00876791 - time (sec): 21.47 - samples/sec: 3321.23 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:12:36,672 epoch 9 - iter 360/723 - loss 0.00859438 - time (sec): 26.63 - samples/sec: 3294.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:12:41,859 epoch 9 - iter 432/723 - loss 0.00864313 - time (sec): 31.82 - samples/sec: 3318.31 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:12:47,352 epoch 9 - iter 504/723 - loss 0.00850276 - time (sec): 37.31 - samples/sec: 3303.20 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:12:52,892 epoch 9 - iter 576/723 - loss 0.00950137 - time (sec): 42.85 - samples/sec: 3299.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:12:57,736 epoch 9 - iter 648/723 - loss 0.00908151 - time (sec): 47.69 - samples/sec: 3312.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:13:03,216 epoch 9 - iter 720/723 - loss 0.01015906 - time (sec): 53.17 - samples/sec: 3303.07 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:13:03,433 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:03,433 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-17 18:13:07,279 DEV : loss 0.13919697701931 - f1-score (micro avg) 0.877
2023-10-17 18:13:07,296 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:12,754 epoch 10 - iter 72/723 - loss 0.01328488 - time (sec): 5.46 - samples/sec: 3397.73 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:13:18,228 epoch 10 - iter 144/723 - loss 0.00911430 - time (sec): 10.93 - samples/sec: 3247.20 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:13:23,492 epoch 10 - iter 216/723 - loss 0.00839632 - time (sec): 16.19 - samples/sec: 3258.25 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:13:28,454 epoch 10 - iter 288/723 - loss 0.00791484 - time (sec): 21.16 - samples/sec: 3301.29 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:13:34,083 epoch 10 - iter 360/723 - loss 0.00921151 - time (sec): 26.79 - samples/sec: 3299.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:13:39,827 epoch 10 - iter 432/723 - loss 0.00866624 - time (sec): 32.53 - samples/sec: 3286.87 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:13:44,998 epoch 10 - iter 504/723 - loss 0.00879169 - time (sec): 37.70 - samples/sec: 3293.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:13:49,760 epoch 10 - iter 576/723 - loss 0.00813104 - time (sec): 42.46 - samples/sec: 3326.77 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:13:55,140 epoch 10 - iter 648/723 - loss 0.00845184 - time (sec): 47.84 - samples/sec: 3334.43 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:14:00,144 epoch 10 - iter 720/723 - loss 0.00785129 - time (sec): 52.85 - samples/sec: 3320.66 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:14:00,361 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:00,361 EPOCH 10 done: loss 0.0078 - lr: 0.000000
2023-10-17 18:14:03,744 DEV : loss 0.13295623660087585 - f1-score (micro avg) 0.8843
2023-10-17 18:14:03,761 saving best model
2023-10-17 18:14:04,705 ----------------------------------------------------------------------------------------------------
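The dev micro-F1 logged after each epoch decides which checkpoint becomes best-model.pt; "saving best model" appears only when the score improves, and epoch 10 wins here. A recap with the DEV scores copied from the lines above:

```python
# Dev micro-F1 per epoch, copied from the DEV lines in this log.
dev_f1 = [0.6435, 0.8086, 0.8812, 0.8791, 0.8520,
          0.8787, 0.8757, 0.8717, 0.8770, 0.8843]

best_epoch = max(range(len(dev_f1)), key=lambda i: dev_f1[i]) + 1
print(best_epoch, dev_f1[best_epoch - 1])  # 10 0.8843
```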
2023-10-17 18:14:04,706 Loading model from best epoch ...
2023-10-17 18:14:06,078 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
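The 13-tag dictionary is a BIOES scheme over three entity types (LOC, PER, ORG) plus the outside tag O, which is why the final linear layer above has out_features=13. A sketch of how the tag set expands:

```python
entity_types = ["LOC", "PER", "ORG"]
# BIOES prefixes in the order the tagger lists them: S(ingle), B(egin), E(nd), I(nside).
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 13, matching the linear layer's out_features
```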
2023-10-17 18:14:09,142
Results:
- F-score (micro) 0.8753
- F-score (macro) 0.7737
- Accuracy 0.7867
By class:
              precision    recall  f1-score   support

         PER     0.8809    0.8589    0.8697       482
         LOC     0.9342    0.9301    0.9322       458
         ORG     0.5484    0.4928    0.5191        69

   micro avg     0.8846    0.8662    0.8753      1009
   macro avg     0.7878    0.7606    0.7737      1009
weighted avg     0.8823    0.8662    0.8741      1009
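The aggregate rows follow directly from the per-class scores: macro F1 is the unweighted class mean, weighted F1 is support-weighted, and micro F1 is the harmonic mean of micro precision and recall. A quick consistency check with the numbers copied from the table:

```python
# Per-class f1 and support, copied from the table above.
classes = {"PER": (0.8697, 482), "LOC": (0.9322, 458), "ORG": (0.5191, 69)}

macro_f1 = sum(f1 for f1, _ in classes.values()) / len(classes)
total = sum(n for _, n in classes.values())
weighted_f1 = sum(f1 * n for f1, n in classes.values()) / total

# Micro F1 = harmonic mean of micro precision and recall.
p, r = 0.8846, 0.8662
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# 0.7737 0.8741 0.8753 -- matching the aggregate rows
```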
2023-10-17 18:14:09,142 ----------------------------------------------------------------------------------------------------