2023-10-13 17:47:04,163 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Train: 5901 sentences
2023-10-13 17:47:04,164 (train_with_dev=False, train_with_test=False)
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Training Params:
2023-10-13 17:47:04,164 - learning_rate: "5e-05"
2023-10-13 17:47:04,164 - mini_batch_size: "8"
2023-10-13 17:47:04,164 - max_epochs: "10"
2023-10-13 17:47:04,164 - shuffle: "True"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Plugins:
2023-10-13 17:47:04,164 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 17:47:04,164 - metric: "('micro avg', 'f1-score')"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Computation:
2023-10-13 17:47:04,164 - compute on device: cuda:0
2023-10-13 17:47:04,164 - embedding storage: none
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
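Note: the per-iteration learning rates in the log below follow the LinearScheduler plugin listed above: linear warmup over the first 10% of the 10 × ceil(5901 / 8) = 7380 total steps, then linear decay to zero. A minimal stand-alone sketch of that schedule (the function name `linear_lr` is illustrative, not Flair's API):

```python
import math

base_lr = 5e-05
steps_per_epoch = math.ceil(5901 / 8)   # 5901 train sentences, mini_batch_size 8 -> 738
total_steps = 10 * steps_per_epoch      # max_epochs 10 -> 7380
warmup_steps = int(0.1 * total_steps)   # warmup_fraction 0.1 -> 738

def linear_lr(step: int) -> float:
    """Linear warmup to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This reproduces the logged values, e.g. step 730 (epoch 1, iter 730) gives ≈ 0.000049 and step 1468 (epoch 2, iter 730) gives ≈ 0.000045.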
2023-10-13 17:47:04,165 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:09,303 epoch 1 - iter 73/738 - loss 2.61704957 - time (sec): 5.14 - samples/sec: 3329.07 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:47:14,890 epoch 1 - iter 146/738 - loss 1.63900339 - time (sec): 10.72 - samples/sec: 3355.14 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:47:19,561 epoch 1 - iter 219/738 - loss 1.25224248 - time (sec): 15.40 - samples/sec: 3383.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:47:24,159 epoch 1 - iter 292/738 - loss 1.03925094 - time (sec): 19.99 - samples/sec: 3392.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:47:28,728 epoch 1 - iter 365/738 - loss 0.89541308 - time (sec): 24.56 - samples/sec: 3399.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:47:33,630 epoch 1 - iter 438/738 - loss 0.79056877 - time (sec): 29.46 - samples/sec: 3393.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:47:37,911 epoch 1 - iter 511/738 - loss 0.71979193 - time (sec): 33.75 - samples/sec: 3392.11 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:47:42,768 epoch 1 - iter 584/738 - loss 0.65785367 - time (sec): 38.60 - samples/sec: 3381.31 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:47:47,675 epoch 1 - iter 657/738 - loss 0.60430136 - time (sec): 43.51 - samples/sec: 3373.75 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:47:53,061 epoch 1 - iter 730/738 - loss 0.55420001 - time (sec): 48.90 - samples/sec: 3371.78 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:47:53,539 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:53,540 EPOCH 1 done: loss 0.5507 - lr: 0.000049
2023-10-13 17:47:59,706 DEV : loss 0.12785974144935608 - f1-score (micro avg) 0.7131
2023-10-13 17:47:59,734 saving best model
2023-10-13 17:48:00,205 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:04,704 epoch 2 - iter 73/738 - loss 0.14031504 - time (sec): 4.50 - samples/sec: 3262.14 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:09,192 epoch 2 - iter 146/738 - loss 0.13796547 - time (sec): 8.99 - samples/sec: 3313.17 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:14,464 epoch 2 - iter 219/738 - loss 0.13466397 - time (sec): 14.26 - samples/sec: 3360.71 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:19,296 epoch 2 - iter 292/738 - loss 0.13225824 - time (sec): 19.09 - samples/sec: 3355.88 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:24,137 epoch 2 - iter 365/738 - loss 0.13022111 - time (sec): 23.93 - samples/sec: 3350.86 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:29,160 epoch 2 - iter 438/738 - loss 0.12720644 - time (sec): 28.95 - samples/sec: 3364.48 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:34,186 epoch 2 - iter 511/738 - loss 0.12457501 - time (sec): 33.98 - samples/sec: 3340.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:38,951 epoch 2 - iter 584/738 - loss 0.12421710 - time (sec): 38.74 - samples/sec: 3349.70 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:44,401 epoch 2 - iter 657/738 - loss 0.12162126 - time (sec): 44.19 - samples/sec: 3350.95 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,387 epoch 2 - iter 730/738 - loss 0.11931305 - time (sec): 49.18 - samples/sec: 3349.23 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,874 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:49,874 EPOCH 2 done: loss 0.1189 - lr: 0.000045
2023-10-13 17:49:01,090 DEV : loss 0.13197503983974457 - f1-score (micro avg) 0.7308
2023-10-13 17:49:01,119 saving best model
2023-10-13 17:49:01,598 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:06,572 epoch 3 - iter 73/738 - loss 0.07614814 - time (sec): 4.97 - samples/sec: 3274.66 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:49:11,815 epoch 3 - iter 146/738 - loss 0.07670962 - time (sec): 10.21 - samples/sec: 3311.41 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:16,339 epoch 3 - iter 219/738 - loss 0.07615619 - time (sec): 14.74 - samples/sec: 3339.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:21,598 epoch 3 - iter 292/738 - loss 0.08533494 - time (sec): 20.00 - samples/sec: 3355.66 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:26,329 epoch 3 - iter 365/738 - loss 0.08052962 - time (sec): 24.73 - samples/sec: 3350.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:31,257 epoch 3 - iter 438/738 - loss 0.07707434 - time (sec): 29.65 - samples/sec: 3330.19 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:36,060 epoch 3 - iter 511/738 - loss 0.07639684 - time (sec): 34.46 - samples/sec: 3344.43 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:41,408 epoch 3 - iter 584/738 - loss 0.07393004 - time (sec): 39.80 - samples/sec: 3333.01 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:46,311 epoch 3 - iter 657/738 - loss 0.07260392 - time (sec): 44.71 - samples/sec: 3315.95 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:51,524 epoch 3 - iter 730/738 - loss 0.07247522 - time (sec): 49.92 - samples/sec: 3305.85 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:49:51,967 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:51,967 EPOCH 3 done: loss 0.0725 - lr: 0.000039
2023-10-13 17:50:03,343 DEV : loss 0.1486140638589859 - f1-score (micro avg) 0.7833
2023-10-13 17:50:03,372 saving best model
2023-10-13 17:50:03,852 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:09,135 epoch 4 - iter 73/738 - loss 0.05135216 - time (sec): 5.28 - samples/sec: 3383.66 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:13,787 epoch 4 - iter 146/738 - loss 0.05140785 - time (sec): 9.93 - samples/sec: 3335.68 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:19,538 epoch 4 - iter 219/738 - loss 0.04837867 - time (sec): 15.68 - samples/sec: 3374.63 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:24,656 epoch 4 - iter 292/738 - loss 0.05298887 - time (sec): 20.80 - samples/sec: 3349.51 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:29,288 epoch 4 - iter 365/738 - loss 0.05192526 - time (sec): 25.43 - samples/sec: 3358.42 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:34,582 epoch 4 - iter 438/738 - loss 0.05297484 - time (sec): 30.72 - samples/sec: 3368.28 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:39,204 epoch 4 - iter 511/738 - loss 0.05272183 - time (sec): 35.34 - samples/sec: 3368.87 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:43,882 epoch 4 - iter 584/738 - loss 0.05406746 - time (sec): 40.02 - samples/sec: 3349.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:48,214 epoch 4 - iter 657/738 - loss 0.05441271 - time (sec): 44.35 - samples/sec: 3352.26 - lr: 0.000034 - momentum: 0.000000
2023-10-13 17:50:52,924 epoch 4 - iter 730/738 - loss 0.05358667 - time (sec): 49.06 - samples/sec: 3359.88 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:50:53,379 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:53,379 EPOCH 4 done: loss 0.0533 - lr: 0.000033
2023-10-13 17:51:04,589 DEV : loss 0.1737738847732544 - f1-score (micro avg) 0.8049
2023-10-13 17:51:04,619 saving best model
2023-10-13 17:51:05,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:10,235 epoch 5 - iter 73/738 - loss 0.03032376 - time (sec): 5.09 - samples/sec: 3312.13 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:51:14,799 epoch 5 - iter 146/738 - loss 0.03581647 - time (sec): 9.65 - samples/sec: 3328.98 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:19,339 epoch 5 - iter 219/738 - loss 0.03520240 - time (sec): 14.19 - samples/sec: 3390.73 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:24,352 epoch 5 - iter 292/738 - loss 0.03677044 - time (sec): 19.21 - samples/sec: 3407.81 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:29,378 epoch 5 - iter 365/738 - loss 0.03507287 - time (sec): 24.23 - samples/sec: 3366.45 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:34,241 epoch 5 - iter 438/738 - loss 0.03455833 - time (sec): 29.10 - samples/sec: 3355.44 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:39,901 epoch 5 - iter 511/738 - loss 0.03549297 - time (sec): 34.76 - samples/sec: 3357.51 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:44,117 epoch 5 - iter 584/738 - loss 0.03649335 - time (sec): 38.97 - samples/sec: 3377.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 17:51:49,172 epoch 5 - iter 657/738 - loss 0.03572356 - time (sec): 44.03 - samples/sec: 3376.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:53,851 epoch 5 - iter 730/738 - loss 0.03593262 - time (sec): 48.71 - samples/sec: 3383.72 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:54,287 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:54,287 EPOCH 5 done: loss 0.0360 - lr: 0.000028
2023-10-13 17:52:05,499 DEV : loss 0.1812753677368164 - f1-score (micro avg) 0.8177
2023-10-13 17:52:05,531 saving best model
2023-10-13 17:52:06,113 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:11,732 epoch 6 - iter 73/738 - loss 0.01621660 - time (sec): 5.61 - samples/sec: 3000.04 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:16,775 epoch 6 - iter 146/738 - loss 0.02080977 - time (sec): 10.66 - samples/sec: 3100.09 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:21,263 epoch 6 - iter 219/738 - loss 0.01876135 - time (sec): 15.14 - samples/sec: 3144.78 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:25,848 epoch 6 - iter 292/738 - loss 0.02138464 - time (sec): 19.73 - samples/sec: 3183.99 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:31,015 epoch 6 - iter 365/738 - loss 0.01978816 - time (sec): 24.90 - samples/sec: 3209.75 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:35,258 epoch 6 - iter 438/738 - loss 0.01938038 - time (sec): 29.14 - samples/sec: 3225.51 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:40,274 epoch 6 - iter 511/738 - loss 0.01928215 - time (sec): 34.16 - samples/sec: 3254.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 17:52:45,579 epoch 6 - iter 584/738 - loss 0.01987479 - time (sec): 39.46 - samples/sec: 3280.26 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:51,330 epoch 6 - iter 657/738 - loss 0.02178292 - time (sec): 45.21 - samples/sec: 3297.22 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:56,145 epoch 6 - iter 730/738 - loss 0.02282192 - time (sec): 50.03 - samples/sec: 3300.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:52:56,552 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:56,552 EPOCH 6 done: loss 0.0228 - lr: 0.000022
2023-10-13 17:53:07,779 DEV : loss 0.21827659010887146 - f1-score (micro avg) 0.7988
2023-10-13 17:53:07,809 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:12,402 epoch 7 - iter 73/738 - loss 0.01465345 - time (sec): 4.59 - samples/sec: 3353.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:53:16,861 epoch 7 - iter 146/738 - loss 0.01573536 - time (sec): 9.05 - samples/sec: 3298.73 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:21,855 epoch 7 - iter 219/738 - loss 0.01844721 - time (sec): 14.04 - samples/sec: 3360.00 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:26,583 epoch 7 - iter 292/738 - loss 0.01755483 - time (sec): 18.77 - samples/sec: 3348.38 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:31,556 epoch 7 - iter 365/738 - loss 0.01732233 - time (sec): 23.75 - samples/sec: 3348.92 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:36,455 epoch 7 - iter 438/738 - loss 0.01872450 - time (sec): 28.64 - samples/sec: 3348.83 - lr: 0.000019 - momentum: 0.000000
2023-10-13 17:53:41,246 epoch 7 - iter 511/738 - loss 0.01801696 - time (sec): 33.44 - samples/sec: 3353.96 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:46,182 epoch 7 - iter 584/738 - loss 0.01924045 - time (sec): 38.37 - samples/sec: 3350.46 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:51,818 epoch 7 - iter 657/738 - loss 0.01897246 - time (sec): 44.01 - samples/sec: 3363.15 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:56,775 epoch 7 - iter 730/738 - loss 0.01878918 - time (sec): 48.96 - samples/sec: 3357.79 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:57,373 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:57,373 EPOCH 7 done: loss 0.0186 - lr: 0.000017
2023-10-13 17:54:08,578 DEV : loss 0.20159663259983063 - f1-score (micro avg) 0.8255
2023-10-13 17:54:08,607 saving best model
2023-10-13 17:54:09,182 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:14,384 epoch 8 - iter 73/738 - loss 0.00867283 - time (sec): 5.20 - samples/sec: 3376.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:18,924 epoch 8 - iter 146/738 - loss 0.00929264 - time (sec): 9.74 - samples/sec: 3338.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:23,900 epoch 8 - iter 219/738 - loss 0.00988928 - time (sec): 14.71 - samples/sec: 3360.24 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:28,467 epoch 8 - iter 292/738 - loss 0.01078611 - time (sec): 19.28 - samples/sec: 3357.48 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:33,531 epoch 8 - iter 365/738 - loss 0.01188262 - time (sec): 24.34 - samples/sec: 3332.34 - lr: 0.000014 - momentum: 0.000000
2023-10-13 17:54:39,068 epoch 8 - iter 438/738 - loss 0.01168210 - time (sec): 29.88 - samples/sec: 3313.89 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:43,328 epoch 8 - iter 511/738 - loss 0.01108505 - time (sec): 34.14 - samples/sec: 3334.40 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:48,539 epoch 8 - iter 584/738 - loss 0.01139473 - time (sec): 39.35 - samples/sec: 3327.81 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:53,175 epoch 8 - iter 657/738 - loss 0.01059925 - time (sec): 43.99 - samples/sec: 3333.14 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:58,385 epoch 8 - iter 730/738 - loss 0.01213062 - time (sec): 49.20 - samples/sec: 3351.79 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:54:58,847 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:58,847 EPOCH 8 done: loss 0.0120 - lr: 0.000011
2023-10-13 17:55:10,116 DEV : loss 0.2121274471282959 - f1-score (micro avg) 0.8167
2023-10-13 17:55:10,146 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:15,100 epoch 9 - iter 73/738 - loss 0.00785780 - time (sec): 4.95 - samples/sec: 3384.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:55:20,217 epoch 9 - iter 146/738 - loss 0.00968859 - time (sec): 10.07 - samples/sec: 3323.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:24,538 epoch 9 - iter 219/738 - loss 0.00766936 - time (sec): 14.39 - samples/sec: 3355.87 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:29,150 epoch 9 - iter 292/738 - loss 0.00776847 - time (sec): 19.00 - samples/sec: 3343.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 17:55:34,169 epoch 9 - iter 365/738 - loss 0.00786568 - time (sec): 24.02 - samples/sec: 3303.61 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:39,513 epoch 9 - iter 438/738 - loss 0.00805319 - time (sec): 29.37 - samples/sec: 3304.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:44,830 epoch 9 - iter 511/738 - loss 0.00739363 - time (sec): 34.68 - samples/sec: 3308.13 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:49,338 epoch 9 - iter 584/738 - loss 0.00731685 - time (sec): 39.19 - samples/sec: 3324.15 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:54,064 epoch 9 - iter 657/738 - loss 0.00765869 - time (sec): 43.92 - samples/sec: 3324.65 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,126 epoch 9 - iter 730/738 - loss 0.00759718 - time (sec): 48.98 - samples/sec: 3359.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,614 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:59,614 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 17:56:10,875 DEV : loss 0.22374621033668518 - f1-score (micro avg) 0.8242
2023-10-13 17:56:10,904 ----------------------------------------------------------------------------------------------------
2023-10-13 17:56:16,195 epoch 10 - iter 73/738 - loss 0.00432633 - time (sec): 5.29 - samples/sec: 3017.62 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:56:21,075 epoch 10 - iter 146/738 - loss 0.00341698 - time (sec): 10.17 - samples/sec: 3203.85 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:25,457 epoch 10 - iter 219/738 - loss 0.00467938 - time (sec): 14.55 - samples/sec: 3261.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:30,710 epoch 10 - iter 292/738 - loss 0.00480875 - time (sec): 19.81 - samples/sec: 3313.14 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:36,255 epoch 10 - iter 365/738 - loss 0.00575497 - time (sec): 25.35 - samples/sec: 3311.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:40,976 epoch 10 - iter 438/738 - loss 0.00574872 - time (sec): 30.07 - samples/sec: 3312.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:45,946 epoch 10 - iter 511/738 - loss 0.00539595 - time (sec): 35.04 - samples/sec: 3329.58 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:51,370 epoch 10 - iter 584/738 - loss 0.00518260 - time (sec): 40.47 - samples/sec: 3321.26 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:56:56,106 epoch 10 - iter 657/738 - loss 0.00512442 - time (sec): 45.20 - samples/sec: 3321.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:57:00,525 epoch 10 - iter 730/738 - loss 0.00499521 - time (sec): 49.62 - samples/sec: 3319.82 - lr: 0.000000 - momentum: 0.000000
2023-10-13 17:57:00,998 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:00,999 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-13 17:57:12,280 DEV : loss 0.22519326210021973 - f1-score (micro avg) 0.8266
2023-10-13 17:57:12,310 saving best model
2023-10-13 17:57:13,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:13,141 Loading model from best epoch ...
2023-10-13 17:57:14,542 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
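Note: the 21-tag dictionary above is the BIOES encoding of the five HIPE-2020 entity types plus the outside tag, which also matches the `out_features=21` of the final linear layer in the model summary. A quick sketch reconstructing it:

```python
entity_types = ["loc", "pers", "org", "time", "prod"]

# BIOES scheme: Single, Begin, End, Inside span tags per type, plus "O" for non-entity tokens
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
```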
2023-10-13 17:57:20,591
Results:
- F-score (micro) 0.8013
- F-score (macro) 0.7071
- Accuracy 0.6949
By class:
              precision    recall  f1-score   support

         loc     0.8622    0.8823    0.8721       858
        pers     0.7549    0.7970    0.7754       537
         org     0.5652    0.5909    0.5778       132
        time     0.5484    0.6296    0.5862        54
        prod     0.7636    0.6885    0.7241        61

   micro avg     0.7876    0.8155    0.8013      1642
   macro avg     0.6989    0.7177    0.7071      1642
weighted avg     0.7892    0.8155    0.8019      1642
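Note: as a sanity check, the macro, weighted, and micro averages in the table follow directly from the per-class rows; a short sketch (values copied from the evaluation table above):

```python
# (precision, recall, f1, support) per class, from the evaluation table
per_class = {
    "loc":  (0.8622, 0.8823, 0.8721, 858),
    "pers": (0.7549, 0.7970, 0.7754, 537),
    "org":  (0.5652, 0.5909, 0.5778, 132),
    "time": (0.5484, 0.6296, 0.5862, 54),
    "prod": (0.7636, 0.6885, 0.7241, 61),
}

# Macro average: unweighted mean of the per-class F1 scores
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the pooled precision and recall
micro_p, micro_r = 0.7876, 0.8155
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
```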
2023-10-13 17:57:20,591 ----------------------------------------------------------------------------------------------------