2023-10-14 01:06:03,859 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,860 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 01:06:03,860 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,860 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-14 01:06:03,860 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,860 Train: 7936 sentences
2023-10-14 01:06:03,860 (train_with_dev=False, train_with_test=False)
2023-10-14 01:06:03,860 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,860 Training Params:
2023-10-14 01:06:03,860 - learning_rate: "3e-05"
2023-10-14 01:06:03,860 - mini_batch_size: "8"
2023-10-14 01:06:03,860 - max_epochs: "10"
2023-10-14 01:06:03,860 - shuffle: "True"
2023-10-14 01:06:03,860 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,860 Plugins:
2023-10-14 01:06:03,860 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 01:06:03,860 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,861 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 01:06:03,861 - metric: "('micro avg', 'f1-score')"
2023-10-14 01:06:03,861 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,861 Computation:
2023-10-14 01:06:03,861 - compute on device: cuda:0
2023-10-14 01:06:03,861 - embedding storage: none
2023-10-14 01:06:03,861 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,861 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 01:06:03,861 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:03,861 ----------------------------------------------------------------------------------------------------
2023-10-14 01:06:09,367 epoch 1 - iter 99/992 - loss 2.17471007 - time (sec): 5.51 - samples/sec: 2804.90 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:06:15,065 epoch 1 - iter 198/992 - loss 1.30549124 - time (sec): 11.20 - samples/sec: 2813.64 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:06:20,977 epoch 1 - iter 297/992 - loss 0.95601849 - time (sec): 17.11 - samples/sec: 2810.76 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:06:26,541 epoch 1 - iter 396/992 - loss 0.76749053 - time (sec): 22.68 - samples/sec: 2838.05 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:06:32,406 epoch 1 - iter 495/992 - loss 0.65224931 - time (sec): 28.54 - samples/sec: 2831.49 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:06:38,300 epoch 1 - iter 594/992 - loss 0.56475688 - time (sec): 34.44 - samples/sec: 2841.74 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:06:44,094 epoch 1 - iter 693/992 - loss 0.50715690 - time (sec): 40.23 - samples/sec: 2827.09 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:06:50,014 epoch 1 - iter 792/992 - loss 0.46053485 - time (sec): 46.15 - samples/sec: 2818.19 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:06:55,862 epoch 1 - iter 891/992 - loss 0.42443987 - time (sec): 52.00 - samples/sec: 2813.04 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:07:01,950 epoch 1 - iter 990/992 - loss 0.39381793 - time (sec): 58.09 - samples/sec: 2813.05 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:07:02,157 ----------------------------------------------------------------------------------------------------
2023-10-14 01:07:02,157 EPOCH 1 done: loss 0.3930 - lr: 0.000030
2023-10-14 01:07:05,244 DEV : loss 0.09621600061655045 - f1-score (micro avg) 0.6851
2023-10-14 01:07:05,264 saving best model
2023-10-14 01:07:05,661 ----------------------------------------------------------------------------------------------------
2023-10-14 01:07:11,275 epoch 2 - iter 99/992 - loss 0.13490577 - time (sec): 5.61 - samples/sec: 2709.42 - lr: 0.000030 - momentum: 0.000000
2023-10-14 01:07:17,108 epoch 2 - iter 198/992 - loss 0.11610667 - time (sec): 11.45 - samples/sec: 2725.35 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:07:22,696 epoch 2 - iter 297/992 - loss 0.11416877 - time (sec): 17.03 - samples/sec: 2766.88 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:07:28,612 epoch 2 - iter 396/992 - loss 0.10769612 - time (sec): 22.95 - samples/sec: 2773.46 - lr: 0.000029 - momentum: 0.000000
2023-10-14 01:07:34,383 epoch 2 - iter 495/992 - loss 0.10656848 - time (sec): 28.72 - samples/sec: 2814.53 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:07:40,280 epoch 2 - iter 594/992 - loss 0.10563479 - time (sec): 34.62 - samples/sec: 2823.23 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:07:46,122 epoch 2 - iter 693/992 - loss 0.10497520 - time (sec): 40.46 - samples/sec: 2823.34 - lr: 0.000028 - momentum: 0.000000
2023-10-14 01:07:51,931 epoch 2 - iter 792/992 - loss 0.10311384 - time (sec): 46.27 - samples/sec: 2814.64 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:07:58,133 epoch 2 - iter 891/992 - loss 0.10194024 - time (sec): 52.47 - samples/sec: 2802.48 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:08:03,960 epoch 2 - iter 990/992 - loss 0.10262038 - time (sec): 58.30 - samples/sec: 2804.64 - lr: 0.000027 - momentum: 0.000000
2023-10-14 01:08:04,121 ----------------------------------------------------------------------------------------------------
2023-10-14 01:08:04,121 EPOCH 2 done: loss 0.1025 - lr: 0.000027
2023-10-14 01:08:07,983 DEV : loss 0.08396855741739273 - f1-score (micro avg) 0.7416
2023-10-14 01:08:08,004 saving best model
2023-10-14 01:08:08,517 ----------------------------------------------------------------------------------------------------
2023-10-14 01:08:14,189 epoch 3 - iter 99/992 - loss 0.06551401 - time (sec): 5.67 - samples/sec: 2662.64 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:08:20,258 epoch 3 - iter 198/992 - loss 0.06924244 - time (sec): 11.74 - samples/sec: 2763.88 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:08:25,798 epoch 3 - iter 297/992 - loss 0.07056573 - time (sec): 17.28 - samples/sec: 2771.16 - lr: 0.000026 - momentum: 0.000000
2023-10-14 01:08:31,708 epoch 3 - iter 396/992 - loss 0.07055584 - time (sec): 23.19 - samples/sec: 2748.52 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:08:37,722 epoch 3 - iter 495/992 - loss 0.06873774 - time (sec): 29.20 - samples/sec: 2788.98 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:08:43,529 epoch 3 - iter 594/992 - loss 0.07045008 - time (sec): 35.01 - samples/sec: 2794.24 - lr: 0.000025 - momentum: 0.000000
2023-10-14 01:08:49,391 epoch 3 - iter 693/992 - loss 0.07037162 - time (sec): 40.87 - samples/sec: 2803.95 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:08:55,506 epoch 3 - iter 792/992 - loss 0.07021216 - time (sec): 46.99 - samples/sec: 2796.17 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:09:01,195 epoch 3 - iter 891/992 - loss 0.06994634 - time (sec): 52.68 - samples/sec: 2790.15 - lr: 0.000024 - momentum: 0.000000
2023-10-14 01:09:06,919 epoch 3 - iter 990/992 - loss 0.06967297 - time (sec): 58.40 - samples/sec: 2801.51 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:09:07,048 ----------------------------------------------------------------------------------------------------
2023-10-14 01:09:07,048 EPOCH 3 done: loss 0.0696 - lr: 0.000023
2023-10-14 01:09:10,503 DEV : loss 0.11555210500955582 - f1-score (micro avg) 0.7446
2023-10-14 01:09:10,523 saving best model
2023-10-14 01:09:11,025 ----------------------------------------------------------------------------------------------------
2023-10-14 01:09:16,953 epoch 4 - iter 99/992 - loss 0.03972797 - time (sec): 5.93 - samples/sec: 2955.76 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:09:22,805 epoch 4 - iter 198/992 - loss 0.04570840 - time (sec): 11.78 - samples/sec: 2867.88 - lr: 0.000023 - momentum: 0.000000
2023-10-14 01:09:28,524 epoch 4 - iter 297/992 - loss 0.04904627 - time (sec): 17.50 - samples/sec: 2862.88 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:09:34,474 epoch 4 - iter 396/992 - loss 0.04945405 - time (sec): 23.45 - samples/sec: 2817.49 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:09:40,476 epoch 4 - iter 495/992 - loss 0.04831346 - time (sec): 29.45 - samples/sec: 2809.35 - lr: 0.000022 - momentum: 0.000000
2023-10-14 01:09:46,365 epoch 4 - iter 594/992 - loss 0.04825902 - time (sec): 35.34 - samples/sec: 2794.59 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:09:52,103 epoch 4 - iter 693/992 - loss 0.04856953 - time (sec): 41.08 - samples/sec: 2782.77 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:09:57,796 epoch 4 - iter 792/992 - loss 0.04919667 - time (sec): 46.77 - samples/sec: 2790.01 - lr: 0.000021 - momentum: 0.000000
2023-10-14 01:10:03,524 epoch 4 - iter 891/992 - loss 0.04895947 - time (sec): 52.50 - samples/sec: 2784.48 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:10:09,641 epoch 4 - iter 990/992 - loss 0.05170296 - time (sec): 58.62 - samples/sec: 2792.06 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:10:09,813 ----------------------------------------------------------------------------------------------------
2023-10-14 01:10:09,813 EPOCH 4 done: loss 0.0517 - lr: 0.000020
2023-10-14 01:10:13,733 DEV : loss 0.11588922142982483 - f1-score (micro avg) 0.7508
2023-10-14 01:10:13,754 saving best model
2023-10-14 01:10:14,263 ----------------------------------------------------------------------------------------------------
2023-10-14 01:10:20,108 epoch 5 - iter 99/992 - loss 0.03385090 - time (sec): 5.84 - samples/sec: 2833.18 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:10:25,922 epoch 5 - iter 198/992 - loss 0.03715409 - time (sec): 11.66 - samples/sec: 2861.38 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:10:31,745 epoch 5 - iter 297/992 - loss 0.03954664 - time (sec): 17.48 - samples/sec: 2820.44 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:10:37,565 epoch 5 - iter 396/992 - loss 0.03785856 - time (sec): 23.30 - samples/sec: 2828.39 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:10:43,326 epoch 5 - iter 495/992 - loss 0.03770491 - time (sec): 29.06 - samples/sec: 2832.89 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:10:49,025 epoch 5 - iter 594/992 - loss 0.03881175 - time (sec): 34.76 - samples/sec: 2844.93 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:10:54,554 epoch 5 - iter 693/992 - loss 0.04006727 - time (sec): 40.29 - samples/sec: 2834.78 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:11:00,484 epoch 5 - iter 792/992 - loss 0.04022835 - time (sec): 46.22 - samples/sec: 2839.18 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:11:06,349 epoch 5 - iter 891/992 - loss 0.03964020 - time (sec): 52.08 - samples/sec: 2827.37 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:11:12,062 epoch 5 - iter 990/992 - loss 0.03963685 - time (sec): 57.80 - samples/sec: 2832.44 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:11:12,174 ----------------------------------------------------------------------------------------------------
2023-10-14 01:11:12,174 EPOCH 5 done: loss 0.0396 - lr: 0.000017
2023-10-14 01:11:15,549 DEV : loss 0.149429589509964 - f1-score (micro avg) 0.7512
2023-10-14 01:11:15,571 saving best model
2023-10-14 01:11:16,077 ----------------------------------------------------------------------------------------------------
2023-10-14 01:11:22,503 epoch 6 - iter 99/992 - loss 0.03207299 - time (sec): 6.42 - samples/sec: 2699.08 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:11:28,059 epoch 6 - iter 198/992 - loss 0.03319848 - time (sec): 11.98 - samples/sec: 2789.57 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:11:33,724 epoch 6 - iter 297/992 - loss 0.03023090 - time (sec): 17.64 - samples/sec: 2790.07 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:11:39,537 epoch 6 - iter 396/992 - loss 0.03113076 - time (sec): 23.45 - samples/sec: 2808.14 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:11:45,406 epoch 6 - iter 495/992 - loss 0.03125986 - time (sec): 29.32 - samples/sec: 2812.25 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:11:51,038 epoch 6 - iter 594/992 - loss 0.03070883 - time (sec): 34.95 - samples/sec: 2815.40 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:11:56,908 epoch 6 - iter 693/992 - loss 0.03147804 - time (sec): 40.82 - samples/sec: 2809.72 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:12:02,821 epoch 6 - iter 792/992 - loss 0.03114004 - time (sec): 46.74 - samples/sec: 2805.07 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:12:08,954 epoch 6 - iter 891/992 - loss 0.03098304 - time (sec): 52.87 - samples/sec: 2800.80 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:12:14,742 epoch 6 - iter 990/992 - loss 0.03115872 - time (sec): 58.66 - samples/sec: 2790.74 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:12:14,852 ----------------------------------------------------------------------------------------------------
2023-10-14 01:12:14,852 EPOCH 6 done: loss 0.0311 - lr: 0.000013
2023-10-14 01:12:18,275 DEV : loss 0.1660223752260208 - f1-score (micro avg) 0.7549
2023-10-14 01:12:18,296 saving best model
2023-10-14 01:12:18,723 ----------------------------------------------------------------------------------------------------
2023-10-14 01:12:24,554 epoch 7 - iter 99/992 - loss 0.02003484 - time (sec): 5.83 - samples/sec: 2773.72 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:12:30,403 epoch 7 - iter 198/992 - loss 0.02219555 - time (sec): 11.68 - samples/sec: 2749.50 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:12:36,359 epoch 7 - iter 297/992 - loss 0.02044054 - time (sec): 17.63 - samples/sec: 2783.24 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:12:42,275 epoch 7 - iter 396/992 - loss 0.02115285 - time (sec): 23.55 - samples/sec: 2787.47 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:12:47,979 epoch 7 - iter 495/992 - loss 0.02111835 - time (sec): 29.25 - samples/sec: 2790.21 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:12:53,906 epoch 7 - iter 594/992 - loss 0.02231115 - time (sec): 35.18 - samples/sec: 2794.36 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:12:59,944 epoch 7 - iter 693/992 - loss 0.02257417 - time (sec): 41.22 - samples/sec: 2790.76 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:13:05,688 epoch 7 - iter 792/992 - loss 0.02390230 - time (sec): 46.96 - samples/sec: 2792.08 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:13:11,978 epoch 7 - iter 891/992 - loss 0.02331735 - time (sec): 53.25 - samples/sec: 2769.06 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:13:17,717 epoch 7 - iter 990/992 - loss 0.02276912 - time (sec): 58.99 - samples/sec: 2774.18 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:13:17,823 ----------------------------------------------------------------------------------------------------
2023-10-14 01:13:17,823 EPOCH 7 done: loss 0.0228 - lr: 0.000010
2023-10-14 01:13:21,216 DEV : loss 0.19811701774597168 - f1-score (micro avg) 0.7519
2023-10-14 01:13:21,240 ----------------------------------------------------------------------------------------------------
2023-10-14 01:13:26,990 epoch 8 - iter 99/992 - loss 0.01904585 - time (sec): 5.75 - samples/sec: 2988.62 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:13:32,631 epoch 8 - iter 198/992 - loss 0.01503292 - time (sec): 11.39 - samples/sec: 2919.52 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:13:38,169 epoch 8 - iter 297/992 - loss 0.01613248 - time (sec): 16.93 - samples/sec: 2884.32 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:13:44,251 epoch 8 - iter 396/992 - loss 0.01565979 - time (sec): 23.01 - samples/sec: 2873.38 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:13:50,113 epoch 8 - iter 495/992 - loss 0.01515949 - time (sec): 28.87 - samples/sec: 2875.24 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:13:56,010 epoch 8 - iter 594/992 - loss 0.01545050 - time (sec): 34.77 - samples/sec: 2873.67 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:14:01,470 epoch 8 - iter 693/992 - loss 0.01547792 - time (sec): 40.23 - samples/sec: 2886.91 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:14:07,035 epoch 8 - iter 792/992 - loss 0.01620138 - time (sec): 45.79 - samples/sec: 2880.33 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:14:12,743 epoch 8 - iter 891/992 - loss 0.01643518 - time (sec): 51.50 - samples/sec: 2871.73 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:14:18,495 epoch 8 - iter 990/992 - loss 0.01691683 - time (sec): 57.25 - samples/sec: 2860.38 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:14:18,596 ----------------------------------------------------------------------------------------------------
2023-10-14 01:14:18,596 EPOCH 8 done: loss 0.0169 - lr: 0.000007
2023-10-14 01:14:22,033 DEV : loss 0.2040073573589325 - f1-score (micro avg) 0.7532
2023-10-14 01:14:22,053 ----------------------------------------------------------------------------------------------------
2023-10-14 01:14:27,795 epoch 9 - iter 99/992 - loss 0.01058698 - time (sec): 5.74 - samples/sec: 2818.02 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:14:33,771 epoch 9 - iter 198/992 - loss 0.01076997 - time (sec): 11.72 - samples/sec: 2835.15 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:14:39,862 epoch 9 - iter 297/992 - loss 0.01232413 - time (sec): 17.81 - samples/sec: 2809.95 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:14:45,579 epoch 9 - iter 396/992 - loss 0.01181418 - time (sec): 23.52 - samples/sec: 2791.33 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:14:51,506 epoch 9 - iter 495/992 - loss 0.01146147 - time (sec): 29.45 - samples/sec: 2793.73 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:14:57,263 epoch 9 - iter 594/992 - loss 0.01205104 - time (sec): 35.21 - samples/sec: 2802.96 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:15:03,302 epoch 9 - iter 693/992 - loss 0.01212528 - time (sec): 41.25 - samples/sec: 2782.13 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:15:09,297 epoch 9 - iter 792/992 - loss 0.01203877 - time (sec): 47.24 - samples/sec: 2788.87 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:15:14,909 epoch 9 - iter 891/992 - loss 0.01226298 - time (sec): 52.85 - samples/sec: 2793.28 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:15:20,595 epoch 9 - iter 990/992 - loss 0.01257149 - time (sec): 58.54 - samples/sec: 2793.66 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:15:20,749 ----------------------------------------------------------------------------------------------------
2023-10-14 01:15:20,750 EPOCH 9 done: loss 0.0125 - lr: 0.000003
2023-10-14 01:15:24,734 DEV : loss 0.20826229453086853 - f1-score (micro avg) 0.7574
2023-10-14 01:15:24,754 saving best model
2023-10-14 01:15:25,268 ----------------------------------------------------------------------------------------------------
2023-10-14 01:15:31,404 epoch 10 - iter 99/992 - loss 0.00769297 - time (sec): 6.13 - samples/sec: 2865.70 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:15:37,404 epoch 10 - iter 198/992 - loss 0.00697383 - time (sec): 12.13 - samples/sec: 2809.85 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:15:42,992 epoch 10 - iter 297/992 - loss 0.00764415 - time (sec): 17.72 - samples/sec: 2775.21 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:15:48,942 epoch 10 - iter 396/992 - loss 0.00827819 - time (sec): 23.67 - samples/sec: 2778.87 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:15:54,828 epoch 10 - iter 495/992 - loss 0.00802192 - time (sec): 29.56 - samples/sec: 2786.17 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:16:00,688 epoch 10 - iter 594/992 - loss 0.00768562 - time (sec): 35.42 - samples/sec: 2778.12 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:16:06,479 epoch 10 - iter 693/992 - loss 0.00852248 - time (sec): 41.21 - samples/sec: 2781.76 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:16:12,477 epoch 10 - iter 792/992 - loss 0.00836690 - time (sec): 47.20 - samples/sec: 2781.41 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:16:18,131 epoch 10 - iter 891/992 - loss 0.00856003 - time (sec): 52.86 - samples/sec: 2797.47 - lr: 0.000000 - momentum: 0.000000
2023-10-14 01:16:23,891 epoch 10 - iter 990/992 - loss 0.00877170 - time (sec): 58.62 - samples/sec: 2792.44 - lr: 0.000000 - momentum: 0.000000
2023-10-14 01:16:23,999 ----------------------------------------------------------------------------------------------------
2023-10-14 01:16:24,000 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-14 01:16:27,448 DEV : loss 0.21987785398960114 - f1-score (micro avg) 0.753
2023-10-14 01:16:27,906 ----------------------------------------------------------------------------------------------------
2023-10-14 01:16:27,907 Loading model from best epoch ...
2023-10-14 01:16:29,242 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 01:16:32,497
Results:
- F-score (micro) 0.7723
- F-score (macro) 0.6898
- Accuracy 0.6513
By class:
class         precision    recall  f1-score   support
LOC              0.8118    0.8427    0.8270       655
PER              0.7379    0.8206    0.7771       223
ORG              0.4831    0.4488    0.4653       127
micro avg        0.7572    0.7881    0.7723      1005
macro avg        0.6776    0.7041    0.6898      1005
weighted avg     0.7538    0.7881    0.7702      1005
2023-10-14 01:16:32,497 ----------------------------------------------------------------------------------------------------
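
The summary rows of the "By class" table can be re-derived from the per-class rows. The snippet below is a minimal sketch of that arithmetic, not part of the original run. The variable names are illustrative, and it assumes the usual definitions: micro F1 is the harmonic mean of micro precision and recall, macro F1 is the unweighted mean of per-class F1, and weighted F1 is the support-weighted mean. The inputs are the rounded values printed in the log, so agreement is only expected to four decimal places.

```python
# Per-class rows copied from the "By class" table above:
# (precision, recall, f1-score, support)
per_class = {
    "LOC": (0.8118, 0.8427, 0.8270, 655),
    "PER": (0.7379, 0.8206, 0.7771, 223),
    "ORG": (0.4831, 0.4488, 0.4653, 127),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.7572, 0.7881


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by the number of gold entities (support).
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"support:     {total_support}")
```

The macro F1 is pulled down mainly by the ORG class, which has both the lowest F1 (0.4653) and the smallest support (127), which is why it sits well below the micro F1.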