2023-10-17 15:52:31,932 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,933 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:52:31,933 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,933 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 15:52:31,933 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,933 Train: 7142 sentences
2023-10-17 15:52:31,934 (train_with_dev=False, train_with_test=False)
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Training Params:
2023-10-17 15:52:31,934  - learning_rate: "5e-05"
2023-10-17 15:52:31,934  - mini_batch_size: "4"
2023-10-17 15:52:31,934  - max_epochs: "10"
2023-10-17 15:52:31,934  - shuffle: "True"
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Plugins:
2023-10-17 15:52:31,934  - TensorboardLogger
2023-10-17 15:52:31,934  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:52:31,934  - metric: "('micro avg', 'f1-score')"
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Computation:
2023-10-17 15:52:31,934  - compute on device: cuda:0
2023-10-17 15:52:31,934  - embedding storage: none
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Model training base path:
"hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:31,934 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:52:41,382 epoch 1 - iter 178/1786 - loss 2.34367862 - time (sec): 9.45 - samples/sec: 2678.64 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:52:50,231 epoch 1 - iter 356/1786 - loss 1.45266562 - time (sec): 18.30 - samples/sec: 2757.85 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:52:58,887 epoch 1 - iter 534/1786 - loss 1.09361339 - time (sec): 26.95 - samples/sec: 2753.10 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:53:07,495 epoch 1 - iter 712/1786 - loss 0.88980840 - time (sec): 35.56 - samples/sec: 2755.02 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:53:16,282 epoch 1 - iter 890/1786 - loss 0.75657457 - time (sec): 44.35 - samples/sec: 2730.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:53:24,937 epoch 1 - iter 1068/1786 - loss 0.66057703 - time (sec): 53.00 - samples/sec: 2749.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:53:33,748 epoch 1 - iter 1246/1786 - loss 0.59119045 - time (sec): 61.81 - samples/sec: 2757.19 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:53:42,598 epoch 1 - iter 1424/1786 - loss 0.53087097 - time (sec): 70.66 - samples/sec: 2786.60 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:53:51,598 epoch 1 - iter 1602/1786 - loss 0.48612856 - time (sec): 79.66 - samples/sec: 2801.21 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:54:00,393 epoch 1 - iter 1780/1786 - loss 0.45402159 - time (sec): 88.46 - samples/sec: 2805.78 - lr: 0.000050 - momentum: 0.000000
2023-10-17 15:54:00,645 ----------------------------------------------------------------------------------------------------
2023-10-17 15:54:00,645 EPOCH 1 done: loss 0.4532 - lr: 0.000050
2023-10-17 15:54:03,411 DEV : loss 0.1179049089550972 - f1-score (micro avg) 0.7143
2023-10-17 15:54:03,428 saving best model
2023-10-17 15:54:03,836 ----------------------------------------------------------------------------------------------------
2023-10-17 15:54:12,853 epoch 2 - iter 178/1786 - loss 0.14248804 - time (sec): 9.02 - samples/sec: 3071.74 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:54:21,514 epoch 2 - iter 356/1786 - loss 0.13559186 - time (sec): 17.68 - samples/sec: 2924.14 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:54:30,388 epoch 2 - iter 534/1786 - loss 0.13301377 - time (sec): 26.55 - samples/sec: 2880.33 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:54:39,321 epoch 2 - iter 712/1786 - loss 0.13307675 - time (sec): 35.48 - samples/sec: 2830.18 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:54:48,107 epoch 2 - iter 890/1786 - loss 0.13518362 - time (sec): 44.27 - samples/sec: 2795.59 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:54:56,864 epoch 2 - iter 1068/1786 - loss 0.13571605 - time (sec): 53.03 - samples/sec: 2785.25 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:55:05,785 epoch 2 - iter 1246/1786 - loss 0.13436108 - time (sec): 61.95 - samples/sec: 2775.13 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:55:15,560 epoch 2 - iter 1424/1786 - loss 0.13261596 - time (sec): 71.72 - samples/sec: 2761.75 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:55:24,336 epoch 2 - iter 1602/1786 - loss 0.13050720 - time (sec): 80.50 - samples/sec: 2762.84 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:55:33,886 epoch 2 - iter 1780/1786 - loss 0.12881272 - time (sec): 90.05 - samples/sec: 2757.02 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:55:34,158 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:34,159 EPOCH 2 done: loss 0.1290 - lr: 0.000044
2023-10-17 15:55:38,485 DEV : loss 0.1279807686805725 - f1-score (micro avg) 0.771
2023-10-17 15:55:38,506 saving best model
2023-10-17 15:55:39,049 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:47,780 epoch 3 - iter 178/1786 - loss 0.09685860 - time (sec): 8.73 - samples/sec: 2786.19 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:55:56,581 epoch 3 - iter 356/1786 - loss 0.08761037 - time (sec): 17.53 - samples/sec: 2841.07 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:56:05,831 epoch 3 - iter 534/1786 - loss 0.08657818 - time (sec): 26.78 - samples/sec: 2886.82 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:56:14,768 epoch 3 - iter 712/1786 - loss 0.08368218 - time (sec): 35.72 - samples/sec: 2854.17 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:56:23,797 epoch 3 - iter 890/1786 - loss 0.08955065 - time (sec): 44.75 - samples/sec: 2865.47 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:56:32,871 epoch 3 - iter 1068/1786 - loss 0.09043215 - time (sec): 53.82 - samples/sec: 2796.37 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:56:42,648 epoch 3 - iter 1246/1786 - loss 0.08937320 - time (sec): 63.60 - samples/sec: 2731.12 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:56:52,145 epoch 3 - iter 1424/1786 - loss 0.08925155 - time (sec): 73.09 - samples/sec: 2725.98 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:57:01,535 epoch 3 - iter 1602/1786 - loss 0.08899510 - time (sec): 82.48 - samples/sec: 2728.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:57:10,421 epoch 3 - iter 1780/1786 - loss 0.08771303 - time (sec): 91.37 - samples/sec: 2713.68 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:57:10,724 ----------------------------------------------------------------------------------------------------
2023-10-17 15:57:10,724 EPOCH 3 done: loss 0.0878 - lr: 0.000039
2023-10-17 15:57:15,642 DEV : loss 0.14156535267829895 - f1-score (micro avg) 0.8008
2023-10-17 15:57:15,659 saving best model
2023-10-17 15:57:16,163 ----------------------------------------------------------------------------------------------------
2023-10-17 15:57:25,653 epoch 4 - iter 178/1786 - loss 0.07429879 - time (sec): 9.49 - samples/sec: 2701.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:57:34,755 epoch 4 - iter 356/1786 - loss 0.06430215 - time (sec): 18.59 - samples/sec: 2733.12 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:57:43,760 epoch 4 - iter 534/1786 - loss 0.06406374 - time (sec): 27.59 - samples/sec: 2746.44 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:57:52,967 epoch 4 - iter 712/1786 - loss 0.06410664 - time (sec): 36.80 - samples/sec: 2702.68 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:58:01,889 epoch 4 - iter 890/1786 - loss 0.06504755 - time (sec): 45.72 - samples/sec: 2702.00 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:58:10,570 epoch 4 - iter 1068/1786 - loss 0.06510199 - time (sec): 54.40 - samples/sec: 2713.58 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:58:19,502 epoch 4 - iter 1246/1786 - loss 0.06435600 - time (sec): 63.34 - samples/sec: 2736.48 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:58:28,372 epoch 4 - iter 1424/1786 - loss 0.06458328 - time (sec): 72.21 - samples/sec: 2746.20 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:58:37,436 epoch 4 - iter 1602/1786 - loss 0.06468852 - time (sec): 81.27 - samples/sec: 2751.42 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:58:46,379 epoch 4 - iter 1780/1786 - loss 0.06609802 - time (sec): 90.21 - samples/sec: 2747.15 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:58:46,710 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:46,711 EPOCH 4 done: loss 0.0662 - lr: 0.000033
2023-10-17 15:58:51,103 DEV : loss 0.15699328482151031 - f1-score (micro avg) 0.7646
2023-10-17 15:58:51,123 ----------------------------------------------------------------------------------------------------
2023-10-17 15:59:00,412 epoch 5 - iter 178/1786 - loss 0.04586062 - time (sec): 9.29 - samples/sec: 2686.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:59:09,251 epoch 5 - iter 356/1786 - loss 0.04291853 - time (sec): 18.13 - samples/sec: 2704.13 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:59:18,399 epoch 5 - iter 534/1786 - loss 0.04747939 - time (sec): 27.27 - samples/sec: 2695.65 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:59:27,224 epoch 5 - iter 712/1786 - loss 0.04918069 - time (sec): 36.10 - samples/sec: 2684.48 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:59:36,569 epoch 5 - iter 890/1786 - loss 0.04784106 - time (sec): 45.44 - samples/sec: 2669.77 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:59:45,755 epoch 5 - iter 1068/1786 - loss 0.04772355 - time (sec): 54.63 - samples/sec: 2682.24 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:59:54,755 epoch 5 - iter 1246/1786 - loss 0.04737999 - time (sec): 63.63 - samples/sec: 2703.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:00:03,645 epoch 5 - iter 1424/1786 - loss 0.04771482 - time (sec): 72.52 - samples/sec: 2718.78 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:00:12,703 epoch 5 - iter 1602/1786 - loss 0.04810478 - time (sec): 81.58 - samples/sec: 2736.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:00:21,665 epoch 5 - iter 1780/1786 - loss 0.04647691 - time (sec): 90.54 - samples/sec: 2741.01 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:00:21,939 ----------------------------------------------------------------------------------------------------
2023-10-17 16:00:21,939 EPOCH 5 done: loss 0.0465 - lr: 0.000028
2023-10-17 16:00:26,974 DEV : loss 0.20135752856731415 - f1-score (micro avg) 0.8131
2023-10-17 16:00:26,998 saving best model
2023-10-17 16:00:27,492 ----------------------------------------------------------------------------------------------------
2023-10-17 16:00:36,901 epoch 6 - iter 178/1786 - loss 0.02371619 - time (sec): 9.41 - samples/sec: 2639.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:00:45,622 epoch 6 - iter 356/1786 - loss 0.02704307 - time (sec): 18.13 - samples/sec: 2637.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:00:54,994 epoch 6 - iter 534/1786 - loss 0.02894294 - time (sec): 27.50 - samples/sec: 2658.13 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:01:04,286 epoch 6 - iter 712/1786 - loss 0.03110961 - time (sec): 36.79 - samples/sec: 2677.96 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:01:13,832 epoch 6 - iter 890/1786 - loss 0.03368810 - time (sec): 46.34 - samples/sec: 2660.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:01:23,709 epoch 6 - iter 1068/1786 - loss 0.03458587 - time (sec): 56.21 - samples/sec: 2659.74 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:01:32,950 epoch 6 - iter 1246/1786 - loss 0.03418798 - time (sec): 65.46 - samples/sec: 2669.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:01:41,737 epoch 6 - iter 1424/1786 - loss 0.03451417 - time (sec): 74.24 - samples/sec: 2687.45 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:01:50,597 epoch 6 - iter 1602/1786 - loss 0.03509028 - time (sec): 83.10 - samples/sec: 2698.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:01:59,482 epoch 6 - iter 1780/1786 - loss 0.03614456 - time (sec): 91.99 - samples/sec: 2696.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:01:59,765 ----------------------------------------------------------------------------------------------------
2023-10-17 16:01:59,766 EPOCH 6 done: loss 0.0362 - lr: 0.000022
2023-10-17 16:02:04,017 DEV : loss 0.1925005316734314 - f1-score (micro avg) 0.8075
2023-10-17 16:02:04,044 ----------------------------------------------------------------------------------------------------
2023-10-17 16:02:13,664 epoch 7 - iter 178/1786 - loss 0.02284920 - time (sec): 9.62 - samples/sec: 2717.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:02:22,697 epoch 7 - iter 356/1786 - loss 0.02740520 - time (sec): 18.65 - samples/sec: 2714.50 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:02:31,859 epoch 7 - iter 534/1786 - loss 0.02590493 - time (sec): 27.81 - samples/sec: 2665.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:02:41,484 epoch 7 - iter 712/1786 - loss 0.02755967 - time (sec): 37.44 - samples/sec: 2657.64 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:02:50,972 epoch 7 - iter 890/1786 - loss 0.02563833 - time (sec): 46.93 - samples/sec: 2650.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:02:59,513 epoch 7 - iter 1068/1786 - loss 0.02699974 - time (sec): 55.47 - samples/sec: 2687.52 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:03:07,993 epoch 7 - iter 1246/1786 - loss 0.02737097 - time (sec): 63.95 - samples/sec: 2710.35 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:03:16,653 epoch 7 - iter 1424/1786 - loss 0.02693964 - time (sec): 72.61 - samples/sec: 2716.12 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:03:26,354 epoch 7 - iter 1602/1786 - loss 0.02616714 - time (sec): 82.31 - samples/sec: 2710.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:03:35,289 epoch 7 - iter 1780/1786 - loss 0.02619563 - time (sec): 91.24 - samples/sec: 2720.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:03:35,580 ----------------------------------------------------------------------------------------------------
2023-10-17 16:03:35,581 EPOCH 7 done: loss 0.0262 - lr: 0.000017
2023-10-17 16:03:39,757 DEV : loss 0.20547646284103394 - f1-score (micro avg) 0.8095
2023-10-17 16:03:39,775 ----------------------------------------------------------------------------------------------------
2023-10-17 16:03:48,334 epoch 8 - iter 178/1786 - loss 0.01592414 - time (sec): 8.56 - samples/sec: 2809.60 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:03:56,788 epoch 8 - iter 356/1786 - loss 0.01794367 - time (sec): 17.01 - samples/sec: 2820.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:04:05,481 epoch 8 - iter 534/1786 - loss 0.01674718 - time (sec): 25.71 - samples/sec: 2849.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:04:14,287 epoch 8 - iter 712/1786 - loss 0.01669581 - time (sec): 34.51 - samples/sec: 2877.25 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:04:23,559 epoch 8 - iter 890/1786 - loss 0.01792091 - time (sec): 43.78 - samples/sec: 2892.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:04:32,216 epoch 8 - iter 1068/1786 - loss 0.01671891 - time (sec): 52.44 - samples/sec: 2902.58 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:04:41,111 epoch 8 - iter 1246/1786 - loss 0.01747172 - time (sec): 61.34 - samples/sec: 2884.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:04:49,915 epoch 8 - iter 1424/1786 - loss 0.01727856 - time (sec): 70.14 - samples/sec: 2859.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:04:58,460 epoch 8 - iter 1602/1786 - loss 0.01739919 - time (sec): 78.68 - samples/sec: 2854.52 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:05:07,380 epoch 8 - iter 1780/1786 - loss 0.01717924 - time (sec): 87.60 - samples/sec: 2828.17 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:05:07,703 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:07,703 EPOCH 8 done: loss 0.0172 - lr: 0.000011
2023-10-17 16:05:12,475 DEV : loss 0.21377256512641907 - f1-score (micro avg) 0.8076
2023-10-17 16:05:12,491 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:21,378 epoch 9 - iter 178/1786 - loss 0.00826584 - time (sec): 8.89 - samples/sec: 2711.80 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:05:30,318 epoch 9 - iter 356/1786 - loss 0.01226686 - time (sec): 17.83 - samples/sec: 2697.41 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:05:39,675 epoch 9 - iter 534/1786 - loss 0.01157243 - time (sec): 27.18 - samples/sec: 2673.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:05:49,071 epoch 9 - iter 712/1786 - loss 0.01175115 - time (sec): 36.58 - samples/sec: 2647.63 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:05:58,453 epoch 9 - iter 890/1786 - loss 0.01181223 - time (sec): 45.96 - samples/sec: 2676.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:06:07,562 epoch 9 - iter 1068/1786 - loss 0.01203776 - time (sec): 55.07 - samples/sec: 2701.84 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:06:16,564 epoch 9 - iter 1246/1786 - loss 0.01231963 - time (sec): 64.07 - samples/sec: 2694.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:06:25,422 epoch 9 - iter 1424/1786 - loss 0.01181941 - time (sec): 72.93 - samples/sec: 2710.53 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:06:34,321 epoch 9 - iter 1602/1786 - loss 0.01147011 - time (sec): 81.83 - samples/sec: 2717.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:06:43,578 epoch 9 - iter 1780/1786 - loss 0.01110474 - time (sec): 91.09 - samples/sec: 2720.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:06:43,868 ----------------------------------------------------------------------------------------------------
2023-10-17 16:06:43,868 EPOCH 9 done: loss 0.0111 - lr: 0.000006
2023-10-17 16:06:48,061 DEV : loss 0.23628994822502136 - f1-score (micro avg) 0.8019
2023-10-17 16:06:48,079 ----------------------------------------------------------------------------------------------------
2023-10-17 16:06:57,034 epoch 10 - iter 178/1786 - loss 0.01008703 - time (sec): 8.95 - samples/sec: 2703.96 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:07:05,834 epoch 10 - iter 356/1786 - loss 0.00789102 - time (sec): 17.75 - samples/sec: 2724.66 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:07:15,085 epoch 10 - iter 534/1786 - loss 0.00797454 - time (sec): 27.00 - samples/sec: 2742.49 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:07:24,045 epoch 10 - iter 712/1786 - loss 0.00890035 - time (sec): 35.96 - samples/sec: 2694.39 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:07:32,988 epoch 10 - iter 890/1786 - loss 0.00798521 - time (sec): 44.91 - samples/sec: 2676.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:07:42,125 epoch 10 - iter 1068/1786 - loss 0.00826627 - time (sec): 54.04 - samples/sec: 2689.69 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:07:51,256 epoch 10 - iter 1246/1786 - loss 0.00780579 - time (sec): 63.18 - samples/sec: 2701.71 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:08:00,277 epoch 10 - iter 1424/1786 - loss 0.00770846 - time (sec): 72.20 - samples/sec: 2703.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:08:09,290 epoch 10 - iter 1602/1786 - loss 0.00704214 - time (sec): 81.21 - samples/sec: 2715.74 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:08:18,427 epoch 10 - iter 1780/1786 - loss 0.00728962 - time (sec): 90.35 - samples/sec: 2745.08 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:08:18,757 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:18,758 EPOCH 10 done: loss 0.0073 - lr: 0.000000
2023-10-17 16:08:24,207 DEV : loss 0.23320543766021729 - f1-score (micro avg) 0.8119
2023-10-17 16:08:24,758 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:24,760 Loading model from best epoch ...
2023-10-17 16:08:26,405 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 16:08:38,013 Results:
- F-score (micro) 0.7026
- F-score (macro) 0.6436
- Accuracy 0.557

By class:
              precision    recall  f1-score   support

         LOC     0.6675    0.7534    0.7079      1095
         PER     0.7524    0.7628    0.7576      1012
         ORG     0.5516    0.5238    0.5374       357
   HumanProd     0.5405    0.6061    0.5714        33

   micro avg     0.6839    0.7225    0.7026      2497
   macro avg     0.6280    0.6615    0.6436      2497
weighted avg     0.6837    0.7225    0.7018      2497

2023-10-17 16:08:38,013 ----------------------------------------------------------------------------------------------------
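The training setup logged at the top of this run (the Training Params block, the LinearScheduler plugin, and the base path encoding bs4 / e10 / lr5e-05 / poolingfirst / layers-1 / crfFalse) maps onto Flair's fine-tuning API. The sketch below is a minimal reconstruction under stated assumptions, not the exact hmBench script: the NER_HIPE_2022 loader arguments, variable names, and the reproject_embeddings flag are assumptions, while the embedding model, subtoken pooling, layer selection, batch size, learning rate, epoch count, and disabled CRF follow the logged configuration. With warmup_fraction 0.1, the learning rate climbs to 5e-05 over the first ~1,786 of the 17,860 total steps (roughly epoch 1) and then decays linearly to zero, which matches the lr column in the iteration lines above.

    # Minimal sketch (assumes a Flair >= 0.12-style API); the actual training script may differ.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # HIPE-2022 newseye/fr split: 7142 train / 698 dev / 2570 test sentences
    corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # "poolingfirst-layers-1" in the base path: last transformer layer, first-subtoken pooling
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # "crfFalse": the printed model has only a Linear(768 -> 17) head with CrossEntropyLoss,
    # so CRF, RNN, and embedding reprojection are all disabled here.
    tagger = SequenceTagger(
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
        "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
        learning_rate=5e-05,
        mini_batch_size=4,
        max_epochs=10,
        # fine_tune() uses a linear LR schedule with 10% warmup (the LinearScheduler plugin above)
    )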
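The test scores above come from the epoch-5 checkpoint (best dev micro-F1 0.8131), and the reported micro F1 is consistent with the micro precision and recall in the table: 2 * 0.6839 * 0.7225 / (0.6839 + 0.7225) ≈ 0.7026. A saved checkpoint like this is typically loaded and applied as sketched below; the path is the training base path plus best-model.pt, and the example sentence is invented.

    # Sketch: load the saved best model and tag a sentence; the example text is illustrative only.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
        "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
    )

    sentence = Sentence("Le journal Le Temps est publié à Paris.")
    tagger.predict(sentence)

    # Entity spans are decoded from the 17 BIOES tags listed above (PER, LOC, ORG, HumanProd).
    for span in sentence.get_spans("ner"):
        print(span.text, span.tag, round(span.score, 2))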