2023-10-13 10:35:15,063 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,064 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
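The layer shapes in the repr above fully determine the backbone's parameter count. A quick tally (my own arithmetic from the printed shapes, not part of the log; GELU, dropout, and LockedDropout contribute no parameters):

```python
# Parameter counts derived from the module shapes printed above.
H, FF, V, P, T, L, TAGS = 768, 3072, 32001, 512, 2, 12, 25

def linear(n_in, n_out):
    return n_in * n_out + n_out          # weight matrix + bias

embeddings = (V + P + T) * H + 2 * H     # token/position/type tables + LayerNorm
per_layer = (
    3 * linear(H, H)                     # query / key / value
    + linear(H, H) + 2 * H               # attention output + LayerNorm
    + linear(H, FF)                      # intermediate
    + linear(FF, H) + 2 * H              # output + LayerNorm
)
pooler = linear(H, H)
head = linear(H, TAGS)                   # the 768 -> 25 tag classifier
total = embeddings + L * per_layer + pooler + head
print(total)                             # -> 110637337, ~110.6M parameters
```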
2023-10-13 10:35:15,064 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 Train: 966 sentences
2023-10-13 10:35:15,065 (train_with_dev=False, train_with_test=False)
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 Training Params:
2023-10-13 10:35:15,065 - learning_rate: "3e-05"
2023-10-13 10:35:15,065 - mini_batch_size: "8"
2023-10-13 10:35:15,065 - max_epochs: "10"
2023-10-13 10:35:15,065 - shuffle: "True"
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 Plugins:
2023-10-13 10:35:15,065 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
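The `lr` column in the iteration lines below follows this plugin: linear warmup over the first 10% of the 1210 total steps (10 epochs x 121 batches per epoch), then linear decay to zero. A rough reimplementation (my own sketch, not Flair's code; Flair's exact step indexing may differ by one, but the rounded values match the log either way):

```python
# Linear warmup + linear decay, as configured above (warmup_fraction 0.1).
TOTAL_STEPS = 10 * 121          # max_epochs x batches per epoch
WARMUP = int(0.1 * TOTAL_STEPS)
PEAK_LR = 3e-05

def lr_at(step):
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP                          # ramp up
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP)  # decay

# epoch 1 iter 12 -> 0.000003; epoch 5 iter 60 -> 0.000018, as logged below
print(round(lr_at(12), 6), round(lr_at(4 * 121 + 60), 6))
```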
2023-10-13 10:35:15,065 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:35:15,065 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 Computation:
2023-10-13 10:35:15,065 - compute on device: cuda:0
2023-10-13 10:35:15,065 - embedding storage: none
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,065 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:15,804 epoch 1 - iter 12/121 - loss 3.40551739 - time (sec): 0.74 - samples/sec: 3161.01 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:35:16,647 epoch 1 - iter 24/121 - loss 3.24233415 - time (sec): 1.58 - samples/sec: 3172.97 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:35:17,390 epoch 1 - iter 36/121 - loss 2.90109831 - time (sec): 2.32 - samples/sec: 3294.00 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:35:18,133 epoch 1 - iter 48/121 - loss 2.45822960 - time (sec): 3.07 - samples/sec: 3281.16 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:35:18,826 epoch 1 - iter 60/121 - loss 2.15770353 - time (sec): 3.76 - samples/sec: 3250.61 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:35:19,639 epoch 1 - iter 72/121 - loss 1.88769949 - time (sec): 4.57 - samples/sec: 3251.16 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:35:20,397 epoch 1 - iter 84/121 - loss 1.70007349 - time (sec): 5.33 - samples/sec: 3245.03 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:35:21,190 epoch 1 - iter 96/121 - loss 1.54737351 - time (sec): 6.12 - samples/sec: 3249.09 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:35:21,905 epoch 1 - iter 108/121 - loss 1.43050638 - time (sec): 6.84 - samples/sec: 3244.75 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:35:22,679 epoch 1 - iter 120/121 - loss 1.33234822 - time (sec): 7.61 - samples/sec: 3225.30 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:35:22,735 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:22,735 EPOCH 1 done: loss 1.3249 - lr: 0.000030
2023-10-13 10:35:23,668 DEV : loss 0.3902263939380646 - f1-score (micro avg) 0.3361
2023-10-13 10:35:23,685 saving best model
2023-10-13 10:35:24,039 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:24,724 epoch 2 - iter 12/121 - loss 0.35517514 - time (sec): 0.68 - samples/sec: 3566.46 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:35:25,409 epoch 2 - iter 24/121 - loss 0.39241062 - time (sec): 1.37 - samples/sec: 3483.56 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:35:26,139 epoch 2 - iter 36/121 - loss 0.35933800 - time (sec): 2.10 - samples/sec: 3517.82 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:35:26,921 epoch 2 - iter 48/121 - loss 0.33946902 - time (sec): 2.88 - samples/sec: 3451.44 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:35:27,627 epoch 2 - iter 60/121 - loss 0.33320072 - time (sec): 3.59 - samples/sec: 3438.02 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:35:28,299 epoch 2 - iter 72/121 - loss 0.32200630 - time (sec): 4.26 - samples/sec: 3412.52 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:35:29,047 epoch 2 - iter 84/121 - loss 0.30624154 - time (sec): 5.01 - samples/sec: 3404.85 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:35:29,839 epoch 2 - iter 96/121 - loss 0.30126156 - time (sec): 5.80 - samples/sec: 3383.72 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:35:30,676 epoch 2 - iter 108/121 - loss 0.28987193 - time (sec): 6.64 - samples/sec: 3390.51 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:35:31,416 epoch 2 - iter 120/121 - loss 0.28445523 - time (sec): 7.38 - samples/sec: 3338.93 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:35:31,462 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:31,462 EPOCH 2 done: loss 0.2837 - lr: 0.000027
2023-10-13 10:35:32,233 DEV : loss 0.19111163914203644 - f1-score (micro avg) 0.5623
2023-10-13 10:35:32,238 saving best model
2023-10-13 10:35:32,686 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:33,377 epoch 3 - iter 12/121 - loss 0.27745640 - time (sec): 0.69 - samples/sec: 3551.48 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:35:34,116 epoch 3 - iter 24/121 - loss 0.22729704 - time (sec): 1.43 - samples/sec: 3492.68 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:35:34,881 epoch 3 - iter 36/121 - loss 0.19706936 - time (sec): 2.19 - samples/sec: 3372.88 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:35:35,590 epoch 3 - iter 48/121 - loss 0.18757581 - time (sec): 2.90 - samples/sec: 3348.38 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:35:36,271 epoch 3 - iter 60/121 - loss 0.17809100 - time (sec): 3.58 - samples/sec: 3402.55 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:35:37,045 epoch 3 - iter 72/121 - loss 0.16626352 - time (sec): 4.36 - samples/sec: 3368.39 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:35:37,764 epoch 3 - iter 84/121 - loss 0.16990025 - time (sec): 5.08 - samples/sec: 3368.12 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:35:38,530 epoch 3 - iter 96/121 - loss 0.16155749 - time (sec): 5.84 - samples/sec: 3369.10 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:35:39,295 epoch 3 - iter 108/121 - loss 0.15883311 - time (sec): 6.61 - samples/sec: 3355.35 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:35:40,051 epoch 3 - iter 120/121 - loss 0.15761228 - time (sec): 7.36 - samples/sec: 3339.73 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:35:40,109 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:40,109 EPOCH 3 done: loss 0.1579 - lr: 0.000023
2023-10-13 10:35:40,972 DEV : loss 0.1423681527376175 - f1-score (micro avg) 0.7642
2023-10-13 10:35:40,979 saving best model
2023-10-13 10:35:41,511 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:42,292 epoch 4 - iter 12/121 - loss 0.13429458 - time (sec): 0.78 - samples/sec: 3242.18 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:35:43,078 epoch 4 - iter 24/121 - loss 0.11189979 - time (sec): 1.57 - samples/sec: 3298.00 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:35:43,815 epoch 4 - iter 36/121 - loss 0.10317212 - time (sec): 2.30 - samples/sec: 3328.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:35:44,491 epoch 4 - iter 48/121 - loss 0.10599027 - time (sec): 2.98 - samples/sec: 3279.26 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:35:45,299 epoch 4 - iter 60/121 - loss 0.10056743 - time (sec): 3.79 - samples/sec: 3310.15 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:35:46,033 epoch 4 - iter 72/121 - loss 0.10062887 - time (sec): 4.52 - samples/sec: 3302.12 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:35:46,795 epoch 4 - iter 84/121 - loss 0.10202049 - time (sec): 5.28 - samples/sec: 3288.87 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:35:47,603 epoch 4 - iter 96/121 - loss 0.10242463 - time (sec): 6.09 - samples/sec: 3240.72 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:35:48,294 epoch 4 - iter 108/121 - loss 0.09987543 - time (sec): 6.78 - samples/sec: 3245.30 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:35:49,059 epoch 4 - iter 120/121 - loss 0.10075786 - time (sec): 7.55 - samples/sec: 3253.37 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:35:49,113 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:49,114 EPOCH 4 done: loss 0.1009 - lr: 0.000020
2023-10-13 10:35:49,900 DEV : loss 0.1301473081111908 - f1-score (micro avg) 0.8228
2023-10-13 10:35:49,905 saving best model
2023-10-13 10:35:50,375 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:51,168 epoch 5 - iter 12/121 - loss 0.07335668 - time (sec): 0.79 - samples/sec: 3202.93 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:35:51,946 epoch 5 - iter 24/121 - loss 0.08423870 - time (sec): 1.57 - samples/sec: 3071.21 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:35:52,729 epoch 5 - iter 36/121 - loss 0.08309725 - time (sec): 2.35 - samples/sec: 3106.08 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:35:53,465 epoch 5 - iter 48/121 - loss 0.07779597 - time (sec): 3.09 - samples/sec: 3120.45 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:35:54,336 epoch 5 - iter 60/121 - loss 0.07549248 - time (sec): 3.96 - samples/sec: 3121.73 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:35:55,110 epoch 5 - iter 72/121 - loss 0.07379139 - time (sec): 4.73 - samples/sec: 3189.45 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:35:55,821 epoch 5 - iter 84/121 - loss 0.07264610 - time (sec): 5.44 - samples/sec: 3244.66 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:35:56,585 epoch 5 - iter 96/121 - loss 0.07123248 - time (sec): 6.21 - samples/sec: 3218.07 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:35:57,256 epoch 5 - iter 108/121 - loss 0.06946222 - time (sec): 6.88 - samples/sec: 3200.51 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:35:58,005 epoch 5 - iter 120/121 - loss 0.07281665 - time (sec): 7.63 - samples/sec: 3213.62 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:35:58,071 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:58,071 EPOCH 5 done: loss 0.0733 - lr: 0.000017
2023-10-13 10:35:58,924 DEV : loss 0.1545332670211792 - f1-score (micro avg) 0.7679
2023-10-13 10:35:58,930 ----------------------------------------------------------------------------------------------------
2023-10-13 10:35:59,701 epoch 6 - iter 12/121 - loss 0.05131288 - time (sec): 0.77 - samples/sec: 3310.35 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:36:00,421 epoch 6 - iter 24/121 - loss 0.06035090 - time (sec): 1.49 - samples/sec: 3132.48 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:36:01,130 epoch 6 - iter 36/121 - loss 0.05974828 - time (sec): 2.20 - samples/sec: 3198.46 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:36:01,870 epoch 6 - iter 48/121 - loss 0.05659050 - time (sec): 2.94 - samples/sec: 3181.53 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:36:02,691 epoch 6 - iter 60/121 - loss 0.05427630 - time (sec): 3.76 - samples/sec: 3204.98 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:36:03,434 epoch 6 - iter 72/121 - loss 0.05221125 - time (sec): 4.50 - samples/sec: 3227.51 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:36:04,236 epoch 6 - iter 84/121 - loss 0.05017516 - time (sec): 5.30 - samples/sec: 3244.65 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:36:05,010 epoch 6 - iter 96/121 - loss 0.04854167 - time (sec): 6.08 - samples/sec: 3281.01 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:36:05,803 epoch 6 - iter 108/121 - loss 0.05071127 - time (sec): 6.87 - samples/sec: 3269.30 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:36:06,579 epoch 6 - iter 120/121 - loss 0.05091063 - time (sec): 7.65 - samples/sec: 3219.30 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:36:06,634 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:06,634 EPOCH 6 done: loss 0.0509 - lr: 0.000013
2023-10-13 10:36:07,462 DEV : loss 0.1363101452589035 - f1-score (micro avg) 0.807
2023-10-13 10:36:07,468 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:08,222 epoch 7 - iter 12/121 - loss 0.04505925 - time (sec): 0.75 - samples/sec: 2936.03 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:36:09,021 epoch 7 - iter 24/121 - loss 0.02960900 - time (sec): 1.55 - samples/sec: 2974.90 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:36:09,845 epoch 7 - iter 36/121 - loss 0.03010326 - time (sec): 2.38 - samples/sec: 3023.85 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:36:10,577 epoch 7 - iter 48/121 - loss 0.03600332 - time (sec): 3.11 - samples/sec: 3086.87 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:36:11,342 epoch 7 - iter 60/121 - loss 0.04012664 - time (sec): 3.87 - samples/sec: 3112.32 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:36:12,147 epoch 7 - iter 72/121 - loss 0.04193263 - time (sec): 4.68 - samples/sec: 3136.32 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:36:13,005 epoch 7 - iter 84/121 - loss 0.04217293 - time (sec): 5.54 - samples/sec: 3129.39 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:36:13,811 epoch 7 - iter 96/121 - loss 0.03917201 - time (sec): 6.34 - samples/sec: 3122.88 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:36:14,538 epoch 7 - iter 108/121 - loss 0.03778633 - time (sec): 7.07 - samples/sec: 3117.18 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:36:15,378 epoch 7 - iter 120/121 - loss 0.03828768 - time (sec): 7.91 - samples/sec: 3119.85 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:36:15,435 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:15,435 EPOCH 7 done: loss 0.0384 - lr: 0.000010
2023-10-13 10:36:16,482 DEV : loss 0.15187180042266846 - f1-score (micro avg) 0.813
2023-10-13 10:36:16,489 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:17,254 epoch 8 - iter 12/121 - loss 0.02643732 - time (sec): 0.76 - samples/sec: 3190.30 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:36:18,041 epoch 8 - iter 24/121 - loss 0.02551022 - time (sec): 1.55 - samples/sec: 2970.26 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:36:18,749 epoch 8 - iter 36/121 - loss 0.02724930 - time (sec): 2.26 - samples/sec: 3127.62 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:36:19,449 epoch 8 - iter 48/121 - loss 0.02456721 - time (sec): 2.96 - samples/sec: 3143.26 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:36:20,274 epoch 8 - iter 60/121 - loss 0.02288726 - time (sec): 3.78 - samples/sec: 3214.62 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:36:21,006 epoch 8 - iter 72/121 - loss 0.02272784 - time (sec): 4.52 - samples/sec: 3240.78 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:36:21,765 epoch 8 - iter 84/121 - loss 0.02274187 - time (sec): 5.27 - samples/sec: 3212.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:36:22,505 epoch 8 - iter 96/121 - loss 0.02611446 - time (sec): 6.01 - samples/sec: 3252.19 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:36:23,232 epoch 8 - iter 108/121 - loss 0.02989065 - time (sec): 6.74 - samples/sec: 3278.95 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:36:23,966 epoch 8 - iter 120/121 - loss 0.03088350 - time (sec): 7.48 - samples/sec: 3291.48 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:36:24,018 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:24,018 EPOCH 8 done: loss 0.0307 - lr: 0.000007
2023-10-13 10:36:24,872 DEV : loss 0.1658356934785843 - f1-score (micro avg) 0.809
2023-10-13 10:36:24,886 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:25,667 epoch 9 - iter 12/121 - loss 0.01903242 - time (sec): 0.78 - samples/sec: 3483.76 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:36:26,462 epoch 9 - iter 24/121 - loss 0.01913730 - time (sec): 1.57 - samples/sec: 3356.17 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:36:27,242 epoch 9 - iter 36/121 - loss 0.02075890 - time (sec): 2.35 - samples/sec: 3157.09 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:36:27,987 epoch 9 - iter 48/121 - loss 0.02383192 - time (sec): 3.10 - samples/sec: 3218.09 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:36:28,734 epoch 9 - iter 60/121 - loss 0.02159027 - time (sec): 3.85 - samples/sec: 3248.41 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:36:29,502 epoch 9 - iter 72/121 - loss 0.02614018 - time (sec): 4.61 - samples/sec: 3306.57 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:36:30,240 epoch 9 - iter 84/121 - loss 0.02421214 - time (sec): 5.35 - samples/sec: 3296.45 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:36:31,004 epoch 9 - iter 96/121 - loss 0.02618009 - time (sec): 6.12 - samples/sec: 3249.41 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:36:31,801 epoch 9 - iter 108/121 - loss 0.02475726 - time (sec): 6.91 - samples/sec: 3256.98 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:36:32,507 epoch 9 - iter 120/121 - loss 0.02333825 - time (sec): 7.62 - samples/sec: 3232.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:36:32,558 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:32,558 EPOCH 9 done: loss 0.0233 - lr: 0.000004
2023-10-13 10:36:33,440 DEV : loss 0.17156408727169037 - f1-score (micro avg) 0.815
2023-10-13 10:36:33,445 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:34,169 epoch 10 - iter 12/121 - loss 0.02998636 - time (sec): 0.72 - samples/sec: 3128.29 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:36:34,946 epoch 10 - iter 24/121 - loss 0.02844073 - time (sec): 1.50 - samples/sec: 3129.10 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:36:35,739 epoch 10 - iter 36/121 - loss 0.02841094 - time (sec): 2.29 - samples/sec: 3174.92 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:36:36,523 epoch 10 - iter 48/121 - loss 0.02630566 - time (sec): 3.08 - samples/sec: 3218.87 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:36:37,266 epoch 10 - iter 60/121 - loss 0.02357025 - time (sec): 3.82 - samples/sec: 3250.13 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:36:38,010 epoch 10 - iter 72/121 - loss 0.02364052 - time (sec): 4.56 - samples/sec: 3197.12 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:36:38,754 epoch 10 - iter 84/121 - loss 0.02360889 - time (sec): 5.31 - samples/sec: 3175.06 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:36:39,505 epoch 10 - iter 96/121 - loss 0.02308161 - time (sec): 6.06 - samples/sec: 3156.47 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:36:40,265 epoch 10 - iter 108/121 - loss 0.02133307 - time (sec): 6.82 - samples/sec: 3191.46 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:36:41,139 epoch 10 - iter 120/121 - loss 0.01990030 - time (sec): 7.69 - samples/sec: 3189.70 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:36:41,189 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:41,189 EPOCH 10 done: loss 0.0200 - lr: 0.000000
2023-10-13 10:36:41,960 DEV : loss 0.16963137686252594 - f1-score (micro avg) 0.8156
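Per the configuration above, the checkpoint is selected by dev micro-F1, which is why "saving best model" fires only through epoch 4: the epoch-4 score (0.8228) is never beaten afterwards. A sketch of that selection over the logged dev scores:

```python
# Dev micro-F1 per epoch, copied from the DEV lines above.
dev_f1 = [0.3361, 0.5623, 0.7642, 0.8228, 0.7679,
          0.8070, 0.8130, 0.8090, 0.8150, 0.8156]

# "saving best model" fires whenever an epoch improves on all previous ones;
# best-model.pt ends up holding the highest-scoring epoch.
best_epoch = max(range(len(dev_f1)), key=lambda i: dev_f1[i]) + 1
print(best_epoch)  # -> 4
```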
2023-10-13 10:36:42,308 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:42,309 Loading model from best epoch ...
2023-10-13 10:36:43,726 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
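These 25 tags are the BIOES encoding of the six entity types (scope, pers, work, loc, object, date) plus O: Single, Begin, End, and Inside markers per type. A minimal decoder (my own sketch, not Flair's implementation) turns such a tag sequence back into entity spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, inclusive."""
    spans, open_span = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            open_span = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                                   # single-token entity
            spans.append((i, i, label))
            open_span = None
        elif prefix == "B":                                 # entity starts here
            open_span = (i, label)
        elif prefix == "E" and open_span and open_span[1] == label:
            spans.append((open_span[0], i, label))          # entity ends here
            open_span = None
        # "I" continues an open span; nothing to record yet
    return spans

print(bioes_to_spans(["S-pers", "O", "B-work", "I-work", "E-work"]))
# -> [(0, 0, 'pers'), (2, 4, 'work')]
```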
2023-10-13 10:36:44,411
Results:
- F-score (micro) 0.7703
- F-score (macro) 0.4595
- Accuracy 0.6537
By class:
              precision    recall  f1-score   support

        pers     0.7986    0.8273    0.8127       139
       scope     0.8028    0.8837    0.8413       129
        work     0.5957    0.7000    0.6437        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7500    0.7917    0.7703       360
   macro avg     0.4394    0.4822    0.4595       360
weighted avg     0.7284    0.7917    0.7583       360
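The three averages follow directly from the per-class rows; note how the two zero-F1 classes (loc and date, with only 9 and 3 test mentions) drag the macro average far below the micro average. A quick check (pure Python, values copied from the table above):

```python
# (precision, recall, f1, support) per class, from the table above.
rows = {
    "pers":  (0.7986, 0.8273, 0.8127, 139),
    "scope": (0.8028, 0.8837, 0.8413, 129),
    "work":  (0.5957, 0.7000, 0.6437,  80),
    "loc":   (0.0000, 0.0000, 0.0000,   9),
    "date":  (0.0000, 0.0000, 0.0000,   3),
}
total = sum(r[3] for r in rows.values())                       # 360 mentions

# macro avg: unweighted mean of per-class F1
macro_f1 = sum(r[2] for r in rows.values()) / len(rows)

# weighted avg: per-class F1 weighted by support
weighted_f1 = sum(r[2] * r[3] for r in rows.values()) / total

# micro avg: F1 from the pooled precision/recall, F1 = 2PR / (P + R)
micro_f1 = 2 * 0.7500 * 0.7917 / (0.7500 + 0.7917)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# -> 0.4595 0.7583 0.7703, matching the table
```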
2023-10-13 10:36:44,412 ----------------------------------------------------------------------------------------------------