2023-10-17 16:52:18,091 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
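A sketch of how this corpus could be loaded through Flair's HIPE-2022 reader; the exact keyword arguments are an assumption inferred from the dataset path above (newseye, French, document separators enabled).

from flair.datasets import NER_HIPE_2022

# Assumed arguments, inferred from ".../newseye/fr/with_doc_seperator" in the path above.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr", add_document_separator=True)
print(corpus)   # expected: 7142 train + 698 dev + 2570 test sentences

# Label dictionary for the "ner" layer (17 BIOES tags in this run).
label_dict = corpus.make_label_dictionary(label_type="ner")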
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Train: 7142 sentences
2023-10-17 16:52:18,092 (train_with_dev=False, train_with_test=False)
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Training Params:
2023-10-17 16:52:18,092 - learning_rate: "5e-05"
2023-10-17 16:52:18,092 - mini_batch_size: "4"
2023-10-17 16:52:18,092 - max_epochs: "10"
2023-10-17 16:52:18,092 - shuffle: "True"
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Plugins:
2023-10-17 16:52:18,093 - TensorboardLogger
2023-10-17 16:52:18,093 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:52:18,093 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Computation:
2023-10-17 16:52:18,093 - compute on device: cuda:0
2023-10-17 16:52:18,093 - embedding storage: none
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
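The hyper-parameters, scheduler plugin and base path listed above correspond to Flair's fine-tuning entry point. A minimal sketch, assuming a `corpus` and `tagger` as built in the earlier sketches; the warmup_fraction keyword is an assumption (0.1 matches the LinearScheduler plugin logged above), and shuffling plus "none" embedding storage are the defaults used by this run.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
    warmup_fraction=0.1,   # assumed keyword; linear schedule with 10% warmup as logged
)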
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:52:27,358 epoch 1 - iter 178/1786 - loss 2.21871897 - time (sec): 9.26 - samples/sec: 2878.90 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:52:36,409 epoch 1 - iter 356/1786 - loss 1.38296011 - time (sec): 18.31 - samples/sec: 2845.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:52:45,400 epoch 1 - iter 534/1786 - loss 1.05615250 - time (sec): 27.31 - samples/sec: 2783.47 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:54,087 epoch 1 - iter 712/1786 - loss 0.85769517 - time (sec): 35.99 - samples/sec: 2784.06 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:53:02,572 epoch 1 - iter 890/1786 - loss 0.73275737 - time (sec): 44.48 - samples/sec: 2790.69 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:53:11,664 epoch 1 - iter 1068/1786 - loss 0.63820486 - time (sec): 53.57 - samples/sec: 2796.92 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:53:20,466 epoch 1 - iter 1246/1786 - loss 0.56890668 - time (sec): 62.37 - samples/sec: 2799.26 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:53:29,472 epoch 1 - iter 1424/1786 - loss 0.51716606 - time (sec): 71.38 - samples/sec: 2802.99 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:53:38,038 epoch 1 - iter 1602/1786 - loss 0.48029995 - time (sec): 79.94 - samples/sec: 2793.74 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:53:46,960 epoch 1 - iter 1780/1786 - loss 0.44806609 - time (sec): 88.87 - samples/sec: 2792.18 - lr: 0.000050 - momentum: 0.000000
2023-10-17 16:53:47,217 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:47,217 EPOCH 1 done: loss 0.4473 - lr: 0.000050
2023-10-17 16:53:50,565 DEV : loss 0.13324852287769318 - f1-score (micro avg) 0.7523
2023-10-17 16:53:50,581 saving best model
2023-10-17 16:53:50,908 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:00,065 epoch 2 - iter 178/1786 - loss 0.13770242 - time (sec): 9.16 - samples/sec: 2870.67 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:54:09,059 epoch 2 - iter 356/1786 - loss 0.13827377 - time (sec): 18.15 - samples/sec: 2864.91 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:54:17,988 epoch 2 - iter 534/1786 - loss 0.13360174 - time (sec): 27.08 - samples/sec: 2907.68 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:54:27,211 epoch 2 - iter 712/1786 - loss 0.12801144 - time (sec): 36.30 - samples/sec: 2833.67 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:54:36,104 epoch 2 - iter 890/1786 - loss 0.12858850 - time (sec): 45.20 - samples/sec: 2831.78 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:54:44,902 epoch 2 - iter 1068/1786 - loss 0.12567109 - time (sec): 53.99 - samples/sec: 2824.77 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:54:53,158 epoch 2 - iter 1246/1786 - loss 0.12596144 - time (sec): 62.25 - samples/sec: 2802.84 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:55:01,552 epoch 2 - iter 1424/1786 - loss 0.12696890 - time (sec): 70.64 - samples/sec: 2806.15 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:55:10,194 epoch 2 - iter 1602/1786 - loss 0.12429806 - time (sec): 79.29 - samples/sec: 2823.60 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:55:18,590 epoch 2 - iter 1780/1786 - loss 0.12508328 - time (sec): 87.68 - samples/sec: 2825.63 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:55:18,875 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:18,875 EPOCH 2 done: loss 0.1249 - lr: 0.000044
2023-10-17 16:55:23,260 DEV : loss 0.13685545325279236 - f1-score (micro avg) 0.7641
2023-10-17 16:55:23,284 saving best model
2023-10-17 16:55:23,802 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:33,412 epoch 3 - iter 178/1786 - loss 0.08647078 - time (sec): 9.61 - samples/sec: 2576.96 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:55:42,204 epoch 3 - iter 356/1786 - loss 0.08617597 - time (sec): 18.40 - samples/sec: 2580.69 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:55:51,071 epoch 3 - iter 534/1786 - loss 0.08424719 - time (sec): 27.27 - samples/sec: 2641.65 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:55:59,868 epoch 3 - iter 712/1786 - loss 0.08587144 - time (sec): 36.06 - samples/sec: 2659.75 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:56:08,770 epoch 3 - iter 890/1786 - loss 0.08794462 - time (sec): 44.97 - samples/sec: 2682.94 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:56:17,535 epoch 3 - iter 1068/1786 - loss 0.08815726 - time (sec): 53.73 - samples/sec: 2688.62 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:56:26,777 epoch 3 - iter 1246/1786 - loss 0.08905607 - time (sec): 62.97 - samples/sec: 2729.34 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:56:35,913 epoch 3 - iter 1424/1786 - loss 0.08723379 - time (sec): 72.11 - samples/sec: 2753.83 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:56:44,974 epoch 3 - iter 1602/1786 - loss 0.08706683 - time (sec): 81.17 - samples/sec: 2762.42 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:56:54,035 epoch 3 - iter 1780/1786 - loss 0.08825144 - time (sec): 90.23 - samples/sec: 2750.82 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:56:54,315 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:54,315 EPOCH 3 done: loss 0.0882 - lr: 0.000039
2023-10-17 16:56:59,077 DEV : loss 0.14459823071956635 - f1-score (micro avg) 0.7584
2023-10-17 16:56:59,106 ----------------------------------------------------------------------------------------------------
2023-10-17 16:57:08,002 epoch 4 - iter 178/1786 - loss 0.06433659 - time (sec): 8.89 - samples/sec: 2562.16 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:57:17,044 epoch 4 - iter 356/1786 - loss 0.06363777 - time (sec): 17.94 - samples/sec: 2718.43 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:57:26,135 epoch 4 - iter 534/1786 - loss 0.06827523 - time (sec): 27.03 - samples/sec: 2719.40 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:57:35,404 epoch 4 - iter 712/1786 - loss 0.06514736 - time (sec): 36.30 - samples/sec: 2728.35 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:57:45,569 epoch 4 - iter 890/1786 - loss 0.06336587 - time (sec): 46.46 - samples/sec: 2657.00 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:57:55,140 epoch 4 - iter 1068/1786 - loss 0.06381780 - time (sec): 56.03 - samples/sec: 2651.17 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:58:04,219 epoch 4 - iter 1246/1786 - loss 0.06361061 - time (sec): 65.11 - samples/sec: 2676.83 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:58:13,784 epoch 4 - iter 1424/1786 - loss 0.06257745 - time (sec): 74.68 - samples/sec: 2658.04 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:58:22,720 epoch 4 - iter 1602/1786 - loss 0.06274981 - time (sec): 83.61 - samples/sec: 2671.70 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:58:31,629 epoch 4 - iter 1780/1786 - loss 0.06404258 - time (sec): 92.52 - samples/sec: 2678.43 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:58:31,931 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:31,931 EPOCH 4 done: loss 0.0645 - lr: 0.000033
2023-10-17 16:58:36,305 DEV : loss 0.19311568140983582 - f1-score (micro avg) 0.805
2023-10-17 16:58:36,325 saving best model
2023-10-17 16:58:36,922 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:46,869 epoch 5 - iter 178/1786 - loss 0.04260479 - time (sec): 9.94 - samples/sec: 2357.89 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:58:57,159 epoch 5 - iter 356/1786 - loss 0.04983370 - time (sec): 20.24 - samples/sec: 2454.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:59:07,160 epoch 5 - iter 534/1786 - loss 0.04828306 - time (sec): 30.24 - samples/sec: 2471.15 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:59:17,235 epoch 5 - iter 712/1786 - loss 0.04893202 - time (sec): 40.31 - samples/sec: 2496.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:59:26,584 epoch 5 - iter 890/1786 - loss 0.04637195 - time (sec): 49.66 - samples/sec: 2501.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:59:36,704 epoch 5 - iter 1068/1786 - loss 0.04625644 - time (sec): 59.78 - samples/sec: 2473.47 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:59:45,822 epoch 5 - iter 1246/1786 - loss 0.04898382 - time (sec): 68.90 - samples/sec: 2511.42 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:59:54,824 epoch 5 - iter 1424/1786 - loss 0.04759295 - time (sec): 77.90 - samples/sec: 2544.97 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:00:03,625 epoch 5 - iter 1602/1786 - loss 0.04680758 - time (sec): 86.70 - samples/sec: 2587.51 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:00:12,295 epoch 5 - iter 1780/1786 - loss 0.04743130 - time (sec): 95.37 - samples/sec: 2597.21 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:00:12,588 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:12,589 EPOCH 5 done: loss 0.0474 - lr: 0.000028
2023-10-17 17:00:16,651 DEV : loss 0.17322644591331482 - f1-score (micro avg) 0.7981
2023-10-17 17:00:16,668 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:25,622 epoch 6 - iter 178/1786 - loss 0.03367132 - time (sec): 8.95 - samples/sec: 2649.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:00:34,756 epoch 6 - iter 356/1786 - loss 0.02946445 - time (sec): 18.09 - samples/sec: 2716.66 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:00:43,640 epoch 6 - iter 534/1786 - loss 0.03288777 - time (sec): 26.97 - samples/sec: 2698.65 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:00:52,601 epoch 6 - iter 712/1786 - loss 0.03447713 - time (sec): 35.93 - samples/sec: 2741.62 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:01:01,792 epoch 6 - iter 890/1786 - loss 0.03420441 - time (sec): 45.12 - samples/sec: 2753.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:01:10,808 epoch 6 - iter 1068/1786 - loss 0.03406023 - time (sec): 54.14 - samples/sec: 2754.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:01:19,758 epoch 6 - iter 1246/1786 - loss 0.03617254 - time (sec): 63.09 - samples/sec: 2758.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:01:28,690 epoch 6 - iter 1424/1786 - loss 0.03612102 - time (sec): 72.02 - samples/sec: 2735.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:37,772 epoch 6 - iter 1602/1786 - loss 0.03765794 - time (sec): 81.10 - samples/sec: 2746.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:46,880 epoch 6 - iter 1780/1786 - loss 0.03667936 - time (sec): 90.21 - samples/sec: 2751.60 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:01:47,163 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:47,163 EPOCH 6 done: loss 0.0366 - lr: 0.000022
2023-10-17 17:01:52,645 DEV : loss 0.15160760283470154 - f1-score (micro avg) 0.8108
2023-10-17 17:01:52,669 saving best model
2023-10-17 17:01:53,166 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:02,034 epoch 7 - iter 178/1786 - loss 0.02243616 - time (sec): 8.87 - samples/sec: 2670.58 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:02:10,842 epoch 7 - iter 356/1786 - loss 0.02990473 - time (sec): 17.67 - samples/sec: 2698.56 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:19,800 epoch 7 - iter 534/1786 - loss 0.02790768 - time (sec): 26.63 - samples/sec: 2745.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:28,678 epoch 7 - iter 712/1786 - loss 0.02686670 - time (sec): 35.51 - samples/sec: 2767.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:02:37,705 epoch 7 - iter 890/1786 - loss 0.02818929 - time (sec): 44.54 - samples/sec: 2794.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:02:46,743 epoch 7 - iter 1068/1786 - loss 0.02833242 - time (sec): 53.58 - samples/sec: 2778.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:02:55,461 epoch 7 - iter 1246/1786 - loss 0.02713084 - time (sec): 62.29 - samples/sec: 2786.84 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:03:04,492 epoch 7 - iter 1424/1786 - loss 0.02636015 - time (sec): 71.32 - samples/sec: 2794.61 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:03:13,874 epoch 7 - iter 1602/1786 - loss 0.02528908 - time (sec): 80.71 - samples/sec: 2787.60 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:03:22,759 epoch 7 - iter 1780/1786 - loss 0.02472222 - time (sec): 89.59 - samples/sec: 2770.71 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:03:23,041 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:23,041 EPOCH 7 done: loss 0.0247 - lr: 0.000017
2023-10-17 17:03:27,251 DEV : loss 0.19103629887104034 - f1-score (micro avg) 0.8067
2023-10-17 17:03:27,268 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:36,570 epoch 8 - iter 178/1786 - loss 0.01260488 - time (sec): 9.30 - samples/sec: 2824.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:45,688 epoch 8 - iter 356/1786 - loss 0.01459491 - time (sec): 18.42 - samples/sec: 2767.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:54,755 epoch 8 - iter 534/1786 - loss 0.01672564 - time (sec): 27.49 - samples/sec: 2767.12 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:04:03,758 epoch 8 - iter 712/1786 - loss 0.01689015 - time (sec): 36.49 - samples/sec: 2761.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:04:13,230 epoch 8 - iter 890/1786 - loss 0.01743950 - time (sec): 45.96 - samples/sec: 2746.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:04:22,645 epoch 8 - iter 1068/1786 - loss 0.01652408 - time (sec): 55.38 - samples/sec: 2739.98 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:04:32,047 epoch 8 - iter 1246/1786 - loss 0.01619837 - time (sec): 64.78 - samples/sec: 2731.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:04:40,986 epoch 8 - iter 1424/1786 - loss 0.01653528 - time (sec): 73.72 - samples/sec: 2702.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:04:50,044 epoch 8 - iter 1602/1786 - loss 0.01772735 - time (sec): 82.77 - samples/sec: 2690.22 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:04:59,741 epoch 8 - iter 1780/1786 - loss 0.01789757 - time (sec): 92.47 - samples/sec: 2680.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:05:00,037 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:00,037 EPOCH 8 done: loss 0.0178 - lr: 0.000011
2023-10-17 17:05:04,178 DEV : loss 0.19288192689418793 - f1-score (micro avg) 0.8164
2023-10-17 17:05:04,196 saving best model
2023-10-17 17:05:04,690 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:13,676 epoch 9 - iter 178/1786 - loss 0.01009022 - time (sec): 8.98 - samples/sec: 2756.42 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:05:22,516 epoch 9 - iter 356/1786 - loss 0.00986010 - time (sec): 17.82 - samples/sec: 2738.92 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:05:31,550 epoch 9 - iter 534/1786 - loss 0.00972978 - time (sec): 26.86 - samples/sec: 2715.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:05:40,522 epoch 9 - iter 712/1786 - loss 0.01005396 - time (sec): 35.83 - samples/sec: 2724.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:05:49,410 epoch 9 - iter 890/1786 - loss 0.00974288 - time (sec): 44.72 - samples/sec: 2707.57 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:58,027 epoch 9 - iter 1068/1786 - loss 0.01028181 - time (sec): 53.34 - samples/sec: 2735.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:06:06,447 epoch 9 - iter 1246/1786 - loss 0.01054176 - time (sec): 61.76 - samples/sec: 2746.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:15,508 epoch 9 - iter 1424/1786 - loss 0.01094470 - time (sec): 70.82 - samples/sec: 2794.60 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:24,863 epoch 9 - iter 1602/1786 - loss 0.01150858 - time (sec): 80.17 - samples/sec: 2785.58 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:33,903 epoch 9 - iter 1780/1786 - loss 0.01116154 - time (sec): 89.21 - samples/sec: 2777.51 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:34,220 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:34,220 EPOCH 9 done: loss 0.0112 - lr: 0.000006
2023-10-17 17:06:38,448 DEV : loss 0.20903374254703522 - f1-score (micro avg) 0.8159
2023-10-17 17:06:38,465 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:47,597 epoch 10 - iter 178/1786 - loss 0.00568008 - time (sec): 9.13 - samples/sec: 2739.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:06:56,476 epoch 10 - iter 356/1786 - loss 0.00720514 - time (sec): 18.01 - samples/sec: 2737.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:07:06,276 epoch 10 - iter 534/1786 - loss 0.00805828 - time (sec): 27.81 - samples/sec: 2667.98 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:07:15,509 epoch 10 - iter 712/1786 - loss 0.00769778 - time (sec): 37.04 - samples/sec: 2677.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:07:24,581 epoch 10 - iter 890/1786 - loss 0.00701365 - time (sec): 46.11 - samples/sec: 2705.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:07:33,235 epoch 10 - iter 1068/1786 - loss 0.00703323 - time (sec): 54.77 - samples/sec: 2724.05 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:07:42,565 epoch 10 - iter 1246/1786 - loss 0.00729937 - time (sec): 64.10 - samples/sec: 2767.76 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:07:51,492 epoch 10 - iter 1424/1786 - loss 0.00712708 - time (sec): 73.03 - samples/sec: 2778.64 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:07:59,964 epoch 10 - iter 1602/1786 - loss 0.00771465 - time (sec): 81.50 - samples/sec: 2769.03 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:08:08,709 epoch 10 - iter 1780/1786 - loss 0.00761523 - time (sec): 90.24 - samples/sec: 2746.66 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:08:08,994 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:08,995 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-17 17:08:13,185 DEV : loss 0.22629648447036743 - f1-score (micro avg) 0.8182
2023-10-17 17:08:13,202 saving best model
2023-10-17 17:08:14,229 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:14,231 Loading model from best epoch ...
2023-10-17 17:08:15,809 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
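The saved best-model.pt can be loaded and applied in the usual way; a minimal sketch (the example sentence is made up, and the checkpoint path is relative to the training base path above).

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint selected as best model above.
tagger = SequenceTagger.load("best-model.pt")

# Made-up French example sentence for illustration.
sentence = Sentence("Victor Hugo est né à Besançon .")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity)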
2023-10-17 17:08:25,470
Results:
- F-score (micro) 0.7015
- F-score (macro) 0.6396
- Accuracy 0.555
By class:
              precision    recall  f1-score   support

         LOC     0.6994    0.7032    0.7013      1095
         PER     0.7741    0.7688    0.7714      1012
         ORG     0.5105    0.5462    0.5277       357
   HumanProd     0.4528    0.7273    0.5581        33

   micro avg     0.6954    0.7076    0.7015      2497
   macro avg     0.6092    0.6864    0.6396      2497
weighted avg     0.6994    0.7076    0.7030      2497
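As a quick consistency check on the table above, the micro F-score follows from the reported micro precision/recall, and the macro F-score is the unweighted mean of the four per-class F-scores:

# Micro F1 from micro precision/recall reported above.
p, r = 0.6954, 0.7076
print(round(2 * p * r / (p + r), 4))   # 0.7015

# Macro F1 as the unweighted mean of the per-class F1 values.
f1 = [0.7013, 0.7714, 0.5277, 0.5581]
print(round(sum(f1) / len(f1), 4))     # 0.6396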
2023-10-17 17:08:25,470 ----------------------------------------------------------------------------------------------------