2023-10-16 19:31:53,657 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,658 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
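The printed shapes are enough to recover the model size. As a minimal sanity-check sketch (our own arithmetic, not a number reported anywhere in this log), summing weights and biases over the modules above gives roughly 110.6M parameters for the BERT backbone:

```python
# Parameter count implied by the printed module shapes (weights + biases).
# All dimensions are read off the SequenceTagger printout above.
V, H, P, T, I, L = 32001, 768, 512, 2, 3072, 12

def linear(n_in, n_out):
    # Dense layer: weight matrix plus bias vector.
    return n_in * n_out + n_out

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
embeddings = V * H + P * H + T * H + 2 * H

per_layer = (
    3 * linear(H, H)        # query, key, value projections
    + linear(H, H) + 2 * H  # self-attention output dense + LayerNorm
    + linear(H, I)          # intermediate (feed-forward up-projection)
    + linear(I, H) + 2 * H  # output (down-projection) + LayerNorm
)
pooler = linear(H, H)

total = embeddings + L * per_layer + pooler
print(f"{total:,}")  # 110,618,112 (~110.6M parameters)
```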
2023-10-16 19:31:53,658 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,658 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:31:53,658 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Train: 1085 sentences
2023-10-16 19:31:53,659 (train_with_dev=False, train_with_test=False)
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Training Params:
2023-10-16 19:31:53,659 - learning_rate: "5e-05"
2023-10-16 19:31:53,659 - mini_batch_size: "4"
2023-10-16 19:31:53,659 - max_epochs: "10"
2023-10-16 19:31:53,659 - shuffle: "True"
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Plugins:
2023-10-16 19:31:53,659 - LinearScheduler | warmup_fraction: '0.1'
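The `LinearScheduler` plugin with `warmup_fraction: 0.1` explains the `lr:` column in the iteration lines below: the learning rate ramps linearly from 0 to the 5e-05 peak over the first 10% of steps, then decays linearly back to 0. A minimal sketch of that schedule (the function name and endpoint handling are our own; Flair's implementation may differ in off-by-one details):

```python
def linear_warmup_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    # Linear warmup to peak_lr over the first warmup_fraction of steps,
    # then linear decay back to 0 over the remaining steps.
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 272 * 10  # 272 mini-batches per epoch x 10 epochs, as logged below
print(linear_warmup_lr(27, total))    # ~0.000005, matching epoch 1 iter 27
print(linear_warmup_lr(272, total))   # ~0.00005, peak at the end of warmup
print(linear_warmup_lr(2720, total))  # 0.0 at the final step
```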
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:31:53,659 - metric: "('micro avg', 'f1-score')"
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Computation:
2023-10-16 19:31:53,659 - compute on device: cuda:0
2023-10-16 19:31:53,659 - embedding storage: none
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:53,659 ----------------------------------------------------------------------------------------------------
2023-10-16 19:31:55,223 epoch 1 - iter 27/272 - loss 2.95328015 - time (sec): 1.56 - samples/sec: 3068.64 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:31:56,836 epoch 1 - iter 54/272 - loss 2.16693759 - time (sec): 3.18 - samples/sec: 3231.27 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:31:58,277 epoch 1 - iter 81/272 - loss 1.65058382 - time (sec): 4.62 - samples/sec: 3248.92 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:31:59,693 epoch 1 - iter 108/272 - loss 1.41300806 - time (sec): 6.03 - samples/sec: 3246.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:32:01,227 epoch 1 - iter 135/272 - loss 1.16296640 - time (sec): 7.57 - samples/sec: 3364.72 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:32:02,735 epoch 1 - iter 162/272 - loss 1.02940910 - time (sec): 9.07 - samples/sec: 3352.61 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:32:04,318 epoch 1 - iter 189/272 - loss 0.91107036 - time (sec): 10.66 - samples/sec: 3348.23 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:32:05,793 epoch 1 - iter 216/272 - loss 0.83286521 - time (sec): 12.13 - samples/sec: 3351.30 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:32:07,395 epoch 1 - iter 243/272 - loss 0.76057220 - time (sec): 13.73 - samples/sec: 3355.03 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:32:08,993 epoch 1 - iter 270/272 - loss 0.69365828 - time (sec): 15.33 - samples/sec: 3371.47 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:32:09,106 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:09,106 EPOCH 1 done: loss 0.6924 - lr: 0.000049
2023-10-16 19:32:10,110 DEV : loss 0.19004914164543152 - f1-score (micro avg) 0.5426
2023-10-16 19:32:10,114 saving best model
2023-10-16 19:32:10,471 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:12,167 epoch 2 - iter 27/272 - loss 0.19414973 - time (sec): 1.69 - samples/sec: 3479.42 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:32:13,697 epoch 2 - iter 54/272 - loss 0.17106126 - time (sec): 3.22 - samples/sec: 3302.48 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:32:15,263 epoch 2 - iter 81/272 - loss 0.16459301 - time (sec): 4.79 - samples/sec: 3368.49 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:32:16,827 epoch 2 - iter 108/272 - loss 0.18139997 - time (sec): 6.35 - samples/sec: 3303.45 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:32:18,369 epoch 2 - iter 135/272 - loss 0.17164269 - time (sec): 7.90 - samples/sec: 3282.85 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:32:19,835 epoch 2 - iter 162/272 - loss 0.16439995 - time (sec): 9.36 - samples/sec: 3238.76 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:32:21,448 epoch 2 - iter 189/272 - loss 0.15959924 - time (sec): 10.98 - samples/sec: 3291.21 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:32:22,926 epoch 2 - iter 216/272 - loss 0.15587286 - time (sec): 12.45 - samples/sec: 3330.90 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:32:24,472 epoch 2 - iter 243/272 - loss 0.15185873 - time (sec): 14.00 - samples/sec: 3313.00 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:32:26,009 epoch 2 - iter 270/272 - loss 0.14816339 - time (sec): 15.54 - samples/sec: 3331.01 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:32:26,108 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:26,108 EPOCH 2 done: loss 0.1480 - lr: 0.000045
2023-10-16 19:32:27,518 DEV : loss 0.10820505768060684 - f1-score (micro avg) 0.7463
2023-10-16 19:32:27,522 saving best model
2023-10-16 19:32:27,965 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:29,500 epoch 3 - iter 27/272 - loss 0.08155325 - time (sec): 1.53 - samples/sec: 3281.37 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:32:30,876 epoch 3 - iter 54/272 - loss 0.08409408 - time (sec): 2.91 - samples/sec: 3254.26 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:32:32,578 epoch 3 - iter 81/272 - loss 0.08139739 - time (sec): 4.61 - samples/sec: 3376.79 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:32:34,076 epoch 3 - iter 108/272 - loss 0.08180166 - time (sec): 6.11 - samples/sec: 3332.59 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:32:35,533 epoch 3 - iter 135/272 - loss 0.07974796 - time (sec): 7.57 - samples/sec: 3281.81 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:32:37,161 epoch 3 - iter 162/272 - loss 0.08213381 - time (sec): 9.19 - samples/sec: 3327.65 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:32:38,724 epoch 3 - iter 189/272 - loss 0.08010112 - time (sec): 10.76 - samples/sec: 3301.80 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:32:40,302 epoch 3 - iter 216/272 - loss 0.08185408 - time (sec): 12.34 - samples/sec: 3301.59 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:32:41,794 epoch 3 - iter 243/272 - loss 0.08367674 - time (sec): 13.83 - samples/sec: 3290.53 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:32:43,430 epoch 3 - iter 270/272 - loss 0.08124787 - time (sec): 15.46 - samples/sec: 3336.37 - lr: 0.000039 - momentum: 0.000000
2023-10-16 19:32:43,548 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:43,548 EPOCH 3 done: loss 0.0809 - lr: 0.000039
2023-10-16 19:32:44,984 DEV : loss 0.10929891467094421 - f1-score (micro avg) 0.7446
2023-10-16 19:32:44,988 ----------------------------------------------------------------------------------------------------
2023-10-16 19:32:46,576 epoch 4 - iter 27/272 - loss 0.05437417 - time (sec): 1.59 - samples/sec: 3533.96 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:32:48,099 epoch 4 - iter 54/272 - loss 0.04384694 - time (sec): 3.11 - samples/sec: 3554.17 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:32:49,691 epoch 4 - iter 81/272 - loss 0.04594154 - time (sec): 4.70 - samples/sec: 3477.23 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:32:51,240 epoch 4 - iter 108/272 - loss 0.05424351 - time (sec): 6.25 - samples/sec: 3424.77 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:32:52,810 epoch 4 - iter 135/272 - loss 0.04989679 - time (sec): 7.82 - samples/sec: 3412.90 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:32:54,386 epoch 4 - iter 162/272 - loss 0.04978778 - time (sec): 9.40 - samples/sec: 3389.52 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:32:56,065 epoch 4 - iter 189/272 - loss 0.05027613 - time (sec): 11.08 - samples/sec: 3392.18 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:32:57,704 epoch 4 - iter 216/272 - loss 0.05205747 - time (sec): 12.71 - samples/sec: 3307.15 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:32:59,252 epoch 4 - iter 243/272 - loss 0.05199972 - time (sec): 14.26 - samples/sec: 3315.73 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:33:00,770 epoch 4 - iter 270/272 - loss 0.05396688 - time (sec): 15.78 - samples/sec: 3289.80 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:33:00,848 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:00,849 EPOCH 4 done: loss 0.0539 - lr: 0.000033
2023-10-16 19:33:02,295 DEV : loss 0.12052459269762039 - f1-score (micro avg) 0.8095
2023-10-16 19:33:02,299 saving best model
2023-10-16 19:33:02,747 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:04,363 epoch 5 - iter 27/272 - loss 0.04257767 - time (sec): 1.61 - samples/sec: 3332.16 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:33:05,927 epoch 5 - iter 54/272 - loss 0.03104477 - time (sec): 3.17 - samples/sec: 3234.01 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:33:07,592 epoch 5 - iter 81/272 - loss 0.02769835 - time (sec): 4.84 - samples/sec: 3363.99 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:33:09,122 epoch 5 - iter 108/272 - loss 0.03682368 - time (sec): 6.37 - samples/sec: 3352.56 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:33:10,626 epoch 5 - iter 135/272 - loss 0.04182425 - time (sec): 7.87 - samples/sec: 3379.02 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:33:12,140 epoch 5 - iter 162/272 - loss 0.04003463 - time (sec): 9.38 - samples/sec: 3316.19 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:33:13,737 epoch 5 - iter 189/272 - loss 0.03933014 - time (sec): 10.98 - samples/sec: 3353.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:33:15,308 epoch 5 - iter 216/272 - loss 0.03933942 - time (sec): 12.55 - samples/sec: 3338.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:33:16,911 epoch 5 - iter 243/272 - loss 0.03741570 - time (sec): 14.16 - samples/sec: 3358.22 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:33:18,380 epoch 5 - iter 270/272 - loss 0.03884246 - time (sec): 15.62 - samples/sec: 3322.35 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:33:18,463 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:18,463 EPOCH 5 done: loss 0.0388 - lr: 0.000028
2023-10-16 19:33:19,898 DEV : loss 0.1621648073196411 - f1-score (micro avg) 0.7891
2023-10-16 19:33:19,901 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:21,401 epoch 6 - iter 27/272 - loss 0.01912861 - time (sec): 1.50 - samples/sec: 3262.46 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:33:22,800 epoch 6 - iter 54/272 - loss 0.02221352 - time (sec): 2.90 - samples/sec: 3284.19 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:33:24,348 epoch 6 - iter 81/272 - loss 0.02732011 - time (sec): 4.45 - samples/sec: 3290.85 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:33:26,088 epoch 6 - iter 108/272 - loss 0.02665238 - time (sec): 6.19 - samples/sec: 3331.01 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:33:27,525 epoch 6 - iter 135/272 - loss 0.03098490 - time (sec): 7.62 - samples/sec: 3374.30 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:33:29,079 epoch 6 - iter 162/272 - loss 0.03126089 - time (sec): 9.18 - samples/sec: 3286.18 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:33:30,706 epoch 6 - iter 189/272 - loss 0.03235828 - time (sec): 10.80 - samples/sec: 3258.39 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:33:32,321 epoch 6 - iter 216/272 - loss 0.03110078 - time (sec): 12.42 - samples/sec: 3283.92 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:33:33,890 epoch 6 - iter 243/272 - loss 0.02991518 - time (sec): 13.99 - samples/sec: 3270.91 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:33:35,585 epoch 6 - iter 270/272 - loss 0.02933351 - time (sec): 15.68 - samples/sec: 3309.80 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:33:35,675 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:35,675 EPOCH 6 done: loss 0.0293 - lr: 0.000022
2023-10-16 19:33:37,090 DEV : loss 0.14851412177085876 - f1-score (micro avg) 0.8433
2023-10-16 19:33:37,094 saving best model
2023-10-16 19:33:37,537 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:39,088 epoch 7 - iter 27/272 - loss 0.03104110 - time (sec): 1.55 - samples/sec: 3804.25 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:33:40,630 epoch 7 - iter 54/272 - loss 0.02299984 - time (sec): 3.09 - samples/sec: 3494.92 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:33:42,130 epoch 7 - iter 81/272 - loss 0.02190619 - time (sec): 4.59 - samples/sec: 3523.84 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:33:43,696 epoch 7 - iter 108/272 - loss 0.02242868 - time (sec): 6.16 - samples/sec: 3465.79 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:33:45,260 epoch 7 - iter 135/272 - loss 0.02177196 - time (sec): 7.72 - samples/sec: 3405.58 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:33:46,851 epoch 7 - iter 162/272 - loss 0.02050069 - time (sec): 9.31 - samples/sec: 3408.90 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:33:48,584 epoch 7 - iter 189/272 - loss 0.01812006 - time (sec): 11.04 - samples/sec: 3382.28 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:33:50,044 epoch 7 - iter 216/272 - loss 0.01834730 - time (sec): 12.50 - samples/sec: 3345.05 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:33:51,565 epoch 7 - iter 243/272 - loss 0.02112046 - time (sec): 14.02 - samples/sec: 3358.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:33:53,115 epoch 7 - iter 270/272 - loss 0.02135027 - time (sec): 15.57 - samples/sec: 3319.75 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:33:53,200 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:53,200 EPOCH 7 done: loss 0.0212 - lr: 0.000017
2023-10-16 19:33:54,810 DEV : loss 0.16008441150188446 - f1-score (micro avg) 0.8207
2023-10-16 19:33:54,814 ----------------------------------------------------------------------------------------------------
2023-10-16 19:33:56,499 epoch 8 - iter 27/272 - loss 0.01358068 - time (sec): 1.68 - samples/sec: 3467.90 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:33:58,092 epoch 8 - iter 54/272 - loss 0.01355086 - time (sec): 3.28 - samples/sec: 3411.86 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:33:59,532 epoch 8 - iter 81/272 - loss 0.01197700 - time (sec): 4.72 - samples/sec: 3422.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:34:01,011 epoch 8 - iter 108/272 - loss 0.01122890 - time (sec): 6.20 - samples/sec: 3386.16 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:34:02,554 epoch 8 - iter 135/272 - loss 0.01385527 - time (sec): 7.74 - samples/sec: 3427.09 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:34:04,081 epoch 8 - iter 162/272 - loss 0.01323604 - time (sec): 9.27 - samples/sec: 3411.87 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:34:05,881 epoch 8 - iter 189/272 - loss 0.01363279 - time (sec): 11.07 - samples/sec: 3430.21 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:34:07,477 epoch 8 - iter 216/272 - loss 0.01337961 - time (sec): 12.66 - samples/sec: 3382.58 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:34:08,939 epoch 8 - iter 243/272 - loss 0.01371133 - time (sec): 14.12 - samples/sec: 3351.16 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:34:10,433 epoch 8 - iter 270/272 - loss 0.01409468 - time (sec): 15.62 - samples/sec: 3321.67 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:34:10,520 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:10,521 EPOCH 8 done: loss 0.0141 - lr: 0.000011
2023-10-16 19:34:11,958 DEV : loss 0.17060722410678864 - f1-score (micro avg) 0.8088
2023-10-16 19:34:11,962 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:13,452 epoch 9 - iter 27/272 - loss 0.02253323 - time (sec): 1.49 - samples/sec: 3402.69 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:34:15,039 epoch 9 - iter 54/272 - loss 0.02096247 - time (sec): 3.08 - samples/sec: 3492.39 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:34:16,608 epoch 9 - iter 81/272 - loss 0.01767350 - time (sec): 4.65 - samples/sec: 3550.63 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:34:18,108 epoch 9 - iter 108/272 - loss 0.01812866 - time (sec): 6.15 - samples/sec: 3553.70 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:34:19,612 epoch 9 - iter 135/272 - loss 0.01657397 - time (sec): 7.65 - samples/sec: 3446.40 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:34:21,155 epoch 9 - iter 162/272 - loss 0.01586682 - time (sec): 9.19 - samples/sec: 3350.79 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:34:22,715 epoch 9 - iter 189/272 - loss 0.01440847 - time (sec): 10.75 - samples/sec: 3383.69 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:34:24,274 epoch 9 - iter 216/272 - loss 0.01282252 - time (sec): 12.31 - samples/sec: 3381.02 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:34:25,956 epoch 9 - iter 243/272 - loss 0.01219213 - time (sec): 13.99 - samples/sec: 3364.21 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:34:27,400 epoch 9 - iter 270/272 - loss 0.01184073 - time (sec): 15.44 - samples/sec: 3354.34 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:34:27,499 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:27,499 EPOCH 9 done: loss 0.0118 - lr: 0.000006
2023-10-16 19:34:28,911 DEV : loss 0.17339861392974854 - f1-score (micro avg) 0.7993
2023-10-16 19:34:28,915 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:30,641 epoch 10 - iter 27/272 - loss 0.00454535 - time (sec): 1.72 - samples/sec: 3068.80 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:34:32,245 epoch 10 - iter 54/272 - loss 0.00270348 - time (sec): 3.33 - samples/sec: 3229.23 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:34:33,797 epoch 10 - iter 81/272 - loss 0.00830662 - time (sec): 4.88 - samples/sec: 3230.52 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:34:35,204 epoch 10 - iter 108/272 - loss 0.00919675 - time (sec): 6.29 - samples/sec: 3246.34 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:34:36,732 epoch 10 - iter 135/272 - loss 0.00958562 - time (sec): 7.82 - samples/sec: 3213.56 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:34:38,281 epoch 10 - iter 162/272 - loss 0.00906453 - time (sec): 9.37 - samples/sec: 3227.96 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:34:39,835 epoch 10 - iter 189/272 - loss 0.00785760 - time (sec): 10.92 - samples/sec: 3247.03 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:34:41,312 epoch 10 - iter 216/272 - loss 0.00743252 - time (sec): 12.40 - samples/sec: 3274.69 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:34:42,878 epoch 10 - iter 243/272 - loss 0.00781445 - time (sec): 13.96 - samples/sec: 3353.35 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:34:44,315 epoch 10 - iter 270/272 - loss 0.00738827 - time (sec): 15.40 - samples/sec: 3365.36 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:34:44,394 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:44,394 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-16 19:34:45,829 DEV : loss 0.17555038630962372 - f1-score (micro avg) 0.811
2023-10-16 19:34:46,196 ----------------------------------------------------------------------------------------------------
2023-10-16 19:34:46,197 Loading model from best epoch ...
2023-10-16 19:34:47,567 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
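The 17-tag dictionary above follows the BIOES scheme: Single, Begin, End, and Inside markers for each of the four entity types, plus the outside tag O. Reconstructing it is a one-liner (our own illustration, not Flair output):

```python
# BIOES tag set: 4 entity types x 4 positional prefixes + "O" = 17 tags,
# in the same order as the SequenceTagger dictionary logged above.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 17
```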
2023-10-16 19:34:49,746
Results:
- F-score (micro) 0.7744
- F-score (macro) 0.7533
- Accuracy 0.6521
By class:
              precision    recall  f1-score   support

         LOC     0.7982    0.8494    0.8230       312
         PER     0.6679    0.8606    0.7521       208
         ORG     0.5417    0.4727    0.5049        55
   HumanProd     0.9130    0.9545    0.9333        22

   micro avg     0.7317    0.8224    0.7744       597
   macro avg     0.7302    0.7843    0.7533       597
weighted avg     0.7334    0.8224    0.7730       597
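The summary rows of the per-class table can be sanity-checked by hand: macro F1 is the unweighted mean of the four class F1 scores, and micro F1 is the harmonic mean of micro precision and recall. All inputs below are copied from the table:

```python
# Per-class F1 scores from the final test report above.
f1 = {"LOC": 0.8230, "PER": 0.7521, "ORG": 0.5049, "HumanProd": 0.9333}

macro_f1 = sum(f1.values()) / len(f1)
print(round(macro_f1, 4))  # 0.7533, matching the "macro avg" row

# Micro F1 = harmonic mean of micro precision and micro recall.
micro_p, micro_r = 0.7317, 0.8224
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_f1, 4))  # 0.7744, matching the "micro avg" row
```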
2023-10-16 19:34:49,746 ----------------------------------------------------------------------------------------------------