2023-10-17 15:52:31,932 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,933 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:52:31,933 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,933 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
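The corpus and the tagger dumped above can be reassembled with Flair roughly as in the sketch below. This is a hedged reconstruction: the checkpoint name hmteams/teams-base-historic-multilingual-discriminator and the no-CRF, first-subtoken-pooling, last-layer settings are read off the run name further down in this log, and the NER_HIPE_2022 loader arguments are assumptions about Flair's dataset API rather than a script shipped with this file.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2022 NewsEye corpus, French split (7142 train / 698 dev / 2570 test sentences).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")

# 17-entry label dictionary: O plus BIOES tags for PER, LOC, ORG and HumanProd.
label_dict = corpus.make_label_dictionary(label_type="ner")

# Last-layer ELECTRA embeddings, first-subtoken pooling, fine-tuned end to end.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Linear tag head on top of the embeddings; no CRF and no RNN, matching the module dump.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    hidden_size=256,
    use_crf=False,
    use_rnn=False,
)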
2023-10-17 15:52:31,933 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,933 Train: 7142 sentences
2023-10-17 15:52:31,934 (train_with_dev=False, train_with_test=False)
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Training Params:
2023-10-17 15:52:31,934 - learning_rate: "5e-05"
2023-10-17 15:52:31,934 - mini_batch_size: "4"
2023-10-17 15:52:31,934 - max_epochs: "10"
2023-10-17 15:52:31,934 - shuffle: "True"
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Plugins:
2023-10-17 15:52:31,934 - TensorboardLogger
2023-10-17 15:52:31,934 - LinearScheduler | warmup_fraction: '0.1'
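These parameters and plugins map onto Flair's fine-tuning entry point roughly as sketched below; passing warmup_fraction and main_evaluation_metric as keyword arguments is an assumption based on recent Flair versions, and `tagger` and `corpus` are taken from the sketch after the corpus listing above.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# Base path copied from the "Model training base path" line below.
base_path = ("hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-"
             "discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4")

trainer.fine_tune(
    base_path,
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,                               # LinearScheduler plugin with 10% warmup
    main_evaluation_metric=("micro avg", "f1-score"),  # metric used to pick best-model.pt
    # The TensorboardLogger listed above would be attached via the plugins=[...] argument;
    # its import path differs between Flair versions, so it is omitted from this sketch.
)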
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:52:31,934 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Computation:
2023-10-17 15:52:31,934 - compute on device: cuda:0
2023-10-17 15:52:31,934 - embedding storage: none
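These computation settings correspond to Flair's global device and the trainer's embedding-storage mode; a minimal sketch, assuming a single CUDA GPU is available:

import torch
import flair

# "compute on device: cuda:0" - place the model and batches on the first GPU.
flair.device = torch.device("cuda:0")

# "embedding storage: none" is chosen per run, e.g.
# trainer.fine_tune(..., embeddings_storage_mode="none"),
# so embeddings are recomputed on the fly instead of being cached in memory.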
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:52:41,382 epoch 1 - iter 178/1786 - loss 2.34367862 - time (sec): 9.45 - samples/sec: 2678.64 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:52:50,231 epoch 1 - iter 356/1786 - loss 1.45266562 - time (sec): 18.30 - samples/sec: 2757.85 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:52:58,887 epoch 1 - iter 534/1786 - loss 1.09361339 - time (sec): 26.95 - samples/sec: 2753.10 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:53:07,495 epoch 1 - iter 712/1786 - loss 0.88980840 - time (sec): 35.56 - samples/sec: 2755.02 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:53:16,282 epoch 1 - iter 890/1786 - loss 0.75657457 - time (sec): 44.35 - samples/sec: 2730.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:53:24,937 epoch 1 - iter 1068/1786 - loss 0.66057703 - time (sec): 53.00 - samples/sec: 2749.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:53:33,748 epoch 1 - iter 1246/1786 - loss 0.59119045 - time (sec): 61.81 - samples/sec: 2757.19 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:53:42,598 epoch 1 - iter 1424/1786 - loss 0.53087097 - time (sec): 70.66 - samples/sec: 2786.60 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:53:51,598 epoch 1 - iter 1602/1786 - loss 0.48612856 - time (sec): 79.66 - samples/sec: 2801.21 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:54:00,393 epoch 1 - iter 1780/1786 - loss 0.45402159 - time (sec): 88.46 - samples/sec: 2805.78 - lr: 0.000050 - momentum: 0.000000
2023-10-17 15:54:00,645 ----------------------------------------------------------------------------------------------------
2023-10-17 15:54:00,645 EPOCH 1 done: loss 0.4532 - lr: 0.000050
2023-10-17 15:54:03,411 DEV : loss 0.1179049089550972 - f1-score (micro avg) 0.7143
2023-10-17 15:54:03,428 saving best model
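The lr column above follows the LinearScheduler with warmup_fraction 0.1: at 1786 batches per epoch over 10 epochs there are 17,860 steps, so the rate climbs from 0 to 5e-05 across roughly the first epoch (0.000005 at iter 178, 0.000050 at iter 1780) and then decays linearly to 0 by the end of epoch 10. A small sketch of that schedule, independent of Flair and ignoring off-by-one details:

# Linear warmup followed by linear decay, as produced by the LinearScheduler plugin.
def linear_schedule_lr(step, total_steps=1786 * 10, warmup_fraction=0.1, peak_lr=5e-05):
    warmup_steps = int(total_steps * warmup_fraction)   # 1786 steps, about one epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps             # ramp up during epoch 1
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay to 0

# Matches the logged values: ~5e-06 at iter 178 and ~5e-05 near the end of epoch 1.
print(round(linear_schedule_lr(178), 6), round(linear_schedule_lr(1780), 6))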
2023-10-17 15:54:03,836 ----------------------------------------------------------------------------------------------------
2023-10-17 15:54:12,853 epoch 2 - iter 178/1786 - loss 0.14248804 - time (sec): 9.02 - samples/sec: 3071.74 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:54:21,514 epoch 2 - iter 356/1786 - loss 0.13559186 - time (sec): 17.68 - samples/sec: 2924.14 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:54:30,388 epoch 2 - iter 534/1786 - loss 0.13301377 - time (sec): 26.55 - samples/sec: 2880.33 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:54:39,321 epoch 2 - iter 712/1786 - loss 0.13307675 - time (sec): 35.48 - samples/sec: 2830.18 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:54:48,107 epoch 2 - iter 890/1786 - loss 0.13518362 - time (sec): 44.27 - samples/sec: 2795.59 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:54:56,864 epoch 2 - iter 1068/1786 - loss 0.13571605 - time (sec): 53.03 - samples/sec: 2785.25 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:55:05,785 epoch 2 - iter 1246/1786 - loss 0.13436108 - time (sec): 61.95 - samples/sec: 2775.13 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:55:15,560 epoch 2 - iter 1424/1786 - loss 0.13261596 - time (sec): 71.72 - samples/sec: 2761.75 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:55:24,336 epoch 2 - iter 1602/1786 - loss 0.13050720 - time (sec): 80.50 - samples/sec: 2762.84 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:55:33,886 epoch 2 - iter 1780/1786 - loss 0.12881272 - time (sec): 90.05 - samples/sec: 2757.02 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:55:34,158 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:34,159 EPOCH 2 done: loss 0.1290 - lr: 0.000044
2023-10-17 15:55:38,485 DEV : loss 0.1279807686805725 - f1-score (micro avg) 0.771
2023-10-17 15:55:38,506 saving best model
2023-10-17 15:55:39,049 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:47,780 epoch 3 - iter 178/1786 - loss 0.09685860 - time (sec): 8.73 - samples/sec: 2786.19 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:55:56,581 epoch 3 - iter 356/1786 - loss 0.08761037 - time (sec): 17.53 - samples/sec: 2841.07 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:56:05,831 epoch 3 - iter 534/1786 - loss 0.08657818 - time (sec): 26.78 - samples/sec: 2886.82 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:56:14,768 epoch 3 - iter 712/1786 - loss 0.08368218 - time (sec): 35.72 - samples/sec: 2854.17 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:56:23,797 epoch 3 - iter 890/1786 - loss 0.08955065 - time (sec): 44.75 - samples/sec: 2865.47 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:56:32,871 epoch 3 - iter 1068/1786 - loss 0.09043215 - time (sec): 53.82 - samples/sec: 2796.37 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:56:42,648 epoch 3 - iter 1246/1786 - loss 0.08937320 - time (sec): 63.60 - samples/sec: 2731.12 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:56:52,145 epoch 3 - iter 1424/1786 - loss 0.08925155 - time (sec): 73.09 - samples/sec: 2725.98 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:57:01,535 epoch 3 - iter 1602/1786 - loss 0.08899510 - time (sec): 82.48 - samples/sec: 2728.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:57:10,421 epoch 3 - iter 1780/1786 - loss 0.08771303 - time (sec): 91.37 - samples/sec: 2713.68 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:57:10,724 ----------------------------------------------------------------------------------------------------
2023-10-17 15:57:10,724 EPOCH 3 done: loss 0.0878 - lr: 0.000039
2023-10-17 15:57:15,642 DEV : loss 0.14156535267829895 - f1-score (micro avg) 0.8008
2023-10-17 15:57:15,659 saving best model
2023-10-17 15:57:16,163 ----------------------------------------------------------------------------------------------------
2023-10-17 15:57:25,653 epoch 4 - iter 178/1786 - loss 0.07429879 - time (sec): 9.49 - samples/sec: 2701.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:57:34,755 epoch 4 - iter 356/1786 - loss 0.06430215 - time (sec): 18.59 - samples/sec: 2733.12 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:57:43,760 epoch 4 - iter 534/1786 - loss 0.06406374 - time (sec): 27.59 - samples/sec: 2746.44 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:57:52,967 epoch 4 - iter 712/1786 - loss 0.06410664 - time (sec): 36.80 - samples/sec: 2702.68 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:58:01,889 epoch 4 - iter 890/1786 - loss 0.06504755 - time (sec): 45.72 - samples/sec: 2702.00 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:58:10,570 epoch 4 - iter 1068/1786 - loss 0.06510199 - time (sec): 54.40 - samples/sec: 2713.58 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:58:19,502 epoch 4 - iter 1246/1786 - loss 0.06435600 - time (sec): 63.34 - samples/sec: 2736.48 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:58:28,372 epoch 4 - iter 1424/1786 - loss 0.06458328 - time (sec): 72.21 - samples/sec: 2746.20 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:58:37,436 epoch 4 - iter 1602/1786 - loss 0.06468852 - time (sec): 81.27 - samples/sec: 2751.42 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:58:46,379 epoch 4 - iter 1780/1786 - loss 0.06609802 - time (sec): 90.21 - samples/sec: 2747.15 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:58:46,710 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:46,711 EPOCH 4 done: loss 0.0662 - lr: 0.000033
2023-10-17 15:58:51,103 DEV : loss 0.15699328482151031 - f1-score (micro avg) 0.7646
2023-10-17 15:58:51,123 ----------------------------------------------------------------------------------------------------
2023-10-17 15:59:00,412 epoch 5 - iter 178/1786 - loss 0.04586062 - time (sec): 9.29 - samples/sec: 2686.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:59:09,251 epoch 5 - iter 356/1786 - loss 0.04291853 - time (sec): 18.13 - samples/sec: 2704.13 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:59:18,399 epoch 5 - iter 534/1786 - loss 0.04747939 - time (sec): 27.27 - samples/sec: 2695.65 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:59:27,224 epoch 5 - iter 712/1786 - loss 0.04918069 - time (sec): 36.10 - samples/sec: 2684.48 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:59:36,569 epoch 5 - iter 890/1786 - loss 0.04784106 - time (sec): 45.44 - samples/sec: 2669.77 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:59:45,755 epoch 5 - iter 1068/1786 - loss 0.04772355 - time (sec): 54.63 - samples/sec: 2682.24 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:59:54,755 epoch 5 - iter 1246/1786 - loss 0.04737999 - time (sec): 63.63 - samples/sec: 2703.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:00:03,645 epoch 5 - iter 1424/1786 - loss 0.04771482 - time (sec): 72.52 - samples/sec: 2718.78 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:00:12,703 epoch 5 - iter 1602/1786 - loss 0.04810478 - time (sec): 81.58 - samples/sec: 2736.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:00:21,665 epoch 5 - iter 1780/1786 - loss 0.04647691 - time (sec): 90.54 - samples/sec: 2741.01 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:00:21,939 ----------------------------------------------------------------------------------------------------
2023-10-17 16:00:21,939 EPOCH 5 done: loss 0.0465 - lr: 0.000028
2023-10-17 16:00:26,974 DEV : loss 0.20135752856731415 - f1-score (micro avg) 0.8131
2023-10-17 16:00:26,998 saving best model
2023-10-17 16:00:27,492 ----------------------------------------------------------------------------------------------------
2023-10-17 16:00:36,901 epoch 6 - iter 178/1786 - loss 0.02371619 - time (sec): 9.41 - samples/sec: 2639.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:00:45,622 epoch 6 - iter 356/1786 - loss 0.02704307 - time (sec): 18.13 - samples/sec: 2637.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:00:54,994 epoch 6 - iter 534/1786 - loss 0.02894294 - time (sec): 27.50 - samples/sec: 2658.13 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:01:04,286 epoch 6 - iter 712/1786 - loss 0.03110961 - time (sec): 36.79 - samples/sec: 2677.96 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:01:13,832 epoch 6 - iter 890/1786 - loss 0.03368810 - time (sec): 46.34 - samples/sec: 2660.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:01:23,709 epoch 6 - iter 1068/1786 - loss 0.03458587 - time (sec): 56.21 - samples/sec: 2659.74 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:01:32,950 epoch 6 - iter 1246/1786 - loss 0.03418798 - time (sec): 65.46 - samples/sec: 2669.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:01:41,737 epoch 6 - iter 1424/1786 - loss 0.03451417 - time (sec): 74.24 - samples/sec: 2687.45 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:01:50,597 epoch 6 - iter 1602/1786 - loss 0.03509028 - time (sec): 83.10 - samples/sec: 2698.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:01:59,482 epoch 6 - iter 1780/1786 - loss 0.03614456 - time (sec): 91.99 - samples/sec: 2696.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:01:59,765 ----------------------------------------------------------------------------------------------------
2023-10-17 16:01:59,766 EPOCH 6 done: loss 0.0362 - lr: 0.000022
2023-10-17 16:02:04,017 DEV : loss 0.1925005316734314 - f1-score (micro avg) 0.8075
2023-10-17 16:02:04,044 ----------------------------------------------------------------------------------------------------
2023-10-17 16:02:13,664 epoch 7 - iter 178/1786 - loss 0.02284920 - time (sec): 9.62 - samples/sec: 2717.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:02:22,697 epoch 7 - iter 356/1786 - loss 0.02740520 - time (sec): 18.65 - samples/sec: 2714.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:02:31,859 epoch 7 - iter 534/1786 - loss 0.02590493 - time (sec): 27.81 - samples/sec: 2665.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:02:41,484 epoch 7 - iter 712/1786 - loss 0.02755967 - time (sec): 37.44 - samples/sec: 2657.64 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:02:50,972 epoch 7 - iter 890/1786 - loss 0.02563833 - time (sec): 46.93 - samples/sec: 2650.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:02:59,513 epoch 7 - iter 1068/1786 - loss 0.02699974 - time (sec): 55.47 - samples/sec: 2687.52 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:03:07,993 epoch 7 - iter 1246/1786 - loss 0.02737097 - time (sec): 63.95 - samples/sec: 2710.35 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:03:16,653 epoch 7 - iter 1424/1786 - loss 0.02693964 - time (sec): 72.61 - samples/sec: 2716.12 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:03:26,354 epoch 7 - iter 1602/1786 - loss 0.02616714 - time (sec): 82.31 - samples/sec: 2710.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:03:35,289 epoch 7 - iter 1780/1786 - loss 0.02619563 - time (sec): 91.24 - samples/sec: 2720.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:03:35,580 ----------------------------------------------------------------------------------------------------
2023-10-17 16:03:35,581 EPOCH 7 done: loss 0.0262 - lr: 0.000017
2023-10-17 16:03:39,757 DEV : loss 0.20547646284103394 - f1-score (micro avg) 0.8095
2023-10-17 16:03:39,775 ----------------------------------------------------------------------------------------------------
2023-10-17 16:03:48,334 epoch 8 - iter 178/1786 - loss 0.01592414 - time (sec): 8.56 - samples/sec: 2809.60 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:03:56,788 epoch 8 - iter 356/1786 - loss 0.01794367 - time (sec): 17.01 - samples/sec: 2820.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:04:05,481 epoch 8 - iter 534/1786 - loss 0.01674718 - time (sec): 25.71 - samples/sec: 2849.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:04:14,287 epoch 8 - iter 712/1786 - loss 0.01669581 - time (sec): 34.51 - samples/sec: 2877.25 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:04:23,559 epoch 8 - iter 890/1786 - loss 0.01792091 - time (sec): 43.78 - samples/sec: 2892.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:04:32,216 epoch 8 - iter 1068/1786 - loss 0.01671891 - time (sec): 52.44 - samples/sec: 2902.58 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:04:41,111 epoch 8 - iter 1246/1786 - loss 0.01747172 - time (sec): 61.34 - samples/sec: 2884.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:04:49,915 epoch 8 - iter 1424/1786 - loss 0.01727856 - time (sec): 70.14 - samples/sec: 2859.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:04:58,460 epoch 8 - iter 1602/1786 - loss 0.01739919 - time (sec): 78.68 - samples/sec: 2854.52 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:05:07,380 epoch 8 - iter 1780/1786 - loss 0.01717924 - time (sec): 87.60 - samples/sec: 2828.17 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:05:07,703 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:07,703 EPOCH 8 done: loss 0.0172 - lr: 0.000011
2023-10-17 16:05:12,475 DEV : loss 0.21377256512641907 - f1-score (micro avg) 0.8076
2023-10-17 16:05:12,491 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:21,378 epoch 9 - iter 178/1786 - loss 0.00826584 - time (sec): 8.89 - samples/sec: 2711.80 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:05:30,318 epoch 9 - iter 356/1786 - loss 0.01226686 - time (sec): 17.83 - samples/sec: 2697.41 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:05:39,675 epoch 9 - iter 534/1786 - loss 0.01157243 - time (sec): 27.18 - samples/sec: 2673.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:05:49,071 epoch 9 - iter 712/1786 - loss 0.01175115 - time (sec): 36.58 - samples/sec: 2647.63 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:05:58,453 epoch 9 - iter 890/1786 - loss 0.01181223 - time (sec): 45.96 - samples/sec: 2676.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:06:07,562 epoch 9 - iter 1068/1786 - loss 0.01203776 - time (sec): 55.07 - samples/sec: 2701.84 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:06:16,564 epoch 9 - iter 1246/1786 - loss 0.01231963 - time (sec): 64.07 - samples/sec: 2694.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:06:25,422 epoch 9 - iter 1424/1786 - loss 0.01181941 - time (sec): 72.93 - samples/sec: 2710.53 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:06:34,321 epoch 9 - iter 1602/1786 - loss 0.01147011 - time (sec): 81.83 - samples/sec: 2717.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:06:43,578 epoch 9 - iter 1780/1786 - loss 0.01110474 - time (sec): 91.09 - samples/sec: 2720.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:06:43,868 ----------------------------------------------------------------------------------------------------
2023-10-17 16:06:43,868 EPOCH 9 done: loss 0.0111 - lr: 0.000006
2023-10-17 16:06:48,061 DEV : loss 0.23628994822502136 - f1-score (micro avg) 0.8019
2023-10-17 16:06:48,079 ----------------------------------------------------------------------------------------------------
2023-10-17 16:06:57,034 epoch 10 - iter 178/1786 - loss 0.01008703 - time (sec): 8.95 - samples/sec: 2703.96 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:07:05,834 epoch 10 - iter 356/1786 - loss 0.00789102 - time (sec): 17.75 - samples/sec: 2724.66 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:07:15,085 epoch 10 - iter 534/1786 - loss 0.00797454 - time (sec): 27.00 - samples/sec: 2742.49 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:07:24,045 epoch 10 - iter 712/1786 - loss 0.00890035 - time (sec): 35.96 - samples/sec: 2694.39 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:07:32,988 epoch 10 - iter 890/1786 - loss 0.00798521 - time (sec): 44.91 - samples/sec: 2676.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:07:42,125 epoch 10 - iter 1068/1786 - loss 0.00826627 - time (sec): 54.04 - samples/sec: 2689.69 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:07:51,256 epoch 10 - iter 1246/1786 - loss 0.00780579 - time (sec): 63.18 - samples/sec: 2701.71 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:08:00,277 epoch 10 - iter 1424/1786 - loss 0.00770846 - time (sec): 72.20 - samples/sec: 2703.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:08:09,290 epoch 10 - iter 1602/1786 - loss 0.00704214 - time (sec): 81.21 - samples/sec: 2715.74 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:08:18,427 epoch 10 - iter 1780/1786 - loss 0.00728962 - time (sec): 90.35 - samples/sec: 2745.08 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:08:18,757 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:18,758 EPOCH 10 done: loss 0.0073 - lr: 0.000000
2023-10-17 16:08:24,207 DEV : loss 0.23320543766021729 - f1-score (micro avg) 0.8119
2023-10-17 16:08:24,758 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:24,760 Loading model from best epoch ...
2023-10-17 16:08:26,405 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
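The saved best-model.pt (selected on dev micro F1) can be reloaded for inference with the standard Flair API; the local path below is an assumption, built from the training base path quoted earlier in this log, and the example sentence is illustrative only.

from flair.data import Sentence
from flair.models import SequenceTagger

base_path = ("hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-"
             "discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4")
tagger = SequenceTagger.load(f"{base_path}/best-model.pt")

sentence = Sentence("Le général Boulanger est arrivé à Paris .")
tagger.predict(sentence)

# BIOES tags are merged back into entity spans (PER, LOC, ORG, HumanProd).
for entity in sentence.get_spans("ner"):
    print(entity)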
2023-10-17 16:08:38,013
Results:
- F-score (micro) 0.7026
- F-score (macro) 0.6436
- Accuracy 0.557
By class:
              precision    recall  f1-score   support

         LOC     0.6675    0.7534    0.7079      1095
         PER     0.7524    0.7628    0.7576      1012
         ORG     0.5516    0.5238    0.5374       357
   HumanProd     0.5405    0.6061    0.5714        33

   micro avg     0.6839    0.7225    0.7026      2497
   macro avg     0.6280    0.6615    0.6436      2497
weighted avg     0.6837    0.7225    0.7018      2497
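The headline F-scores follow directly from the precision/recall columns above; a quick sanity check (small differences come from rounding in the log):

# Micro F1 from the micro-averaged precision and recall in the table.
p, r = 0.6839, 0.7225
print(round(2 * p * r / (p + r), 4))   # 0.7027, log reports 0.7026 from unrounded values

# Macro F1 as the unweighted mean of the per-class F1 scores.
per_class_f1 = [0.7079, 0.7576, 0.5374, 0.5714]
print(round(sum(per_class_f1) / len(per_class_f1), 4))   # 0.6436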
2023-10-17 16:08:38,013 ----------------------------------------------------------------------------------------------------