2023-10-17 16:52:18,091 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 MultiCorpus: 7142 train + 698 dev + 2570 test sentences - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Train: 7142 sentences
2023-10-17 16:52:18,092 (train_with_dev=False, train_with_test=False)
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Training Params:
2023-10-17 16:52:18,092 - learning_rate: "5e-05"
2023-10-17 16:52:18,092 - mini_batch_size: "4"
2023-10-17 16:52:18,092 - max_epochs: "10"
2023-10-17 16:52:18,092 - shuffle: "True"
2023-10-17 16:52:18,092 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,092 Plugins:
2023-10-17 16:52:18,093 - TensorboardLogger
2023-10-17 16:52:18,093 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:52:18,093 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Computation:
2023-10-17 16:52:18,093 - compute on device: cuda:0
2023-10-17 16:52:18,093 - embedding storage: none
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:18,093 ----------------------------------------------------------------------------------------------------
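Note: the header above records the full configuration of this run: the NER_HIPE_2022 newseye/fr corpus, a TransformerWordEmbeddings backbone over an ELECTRA discriminator, a single linear tag head without CRF, mini-batch size 4, 10 epochs, and a peak learning rate of 5e-05 with a LinearScheduler that warms up over the first 10% of steps (visible in the lr column below, which climbs to 0.000050 during epoch 1 and then decays linearly to zero). A minimal Flair sketch of such a run follows; the Hugging Face model id is inferred from the base path and the NER_HIPE_2022 constructor arguments are assumptions that may differ between Flair versions.

# Sketch only: approximates the configuration logged above, not the original training script.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Assumed constructor arguments; the log shows 7142 train / 698 dev / 2570 test sentences.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed from the base path
    layers="-1",               # "layers-1" in the base path: last layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,           # unused here: no RNN, only the linear head shown in the model dump
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,             # fine_tune attaches a linear scheduler; the log shows warmup_fraction 0.1
)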
"hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" 2023-10-17 16:52:18,093 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:52:18,093 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:52:18,093 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:52:27,358 epoch 1 - iter 178/1786 - loss 2.21871897 - time (sec): 9.26 - samples/sec: 2878.90 - lr: 0.000005 - momentum: 0.000000 2023-10-17 16:52:36,409 epoch 1 - iter 356/1786 - loss 1.38296011 - time (sec): 18.31 - samples/sec: 2845.66 - lr: 0.000010 - momentum: 0.000000 2023-10-17 16:52:45,400 epoch 1 - iter 534/1786 - loss 1.05615250 - time (sec): 27.31 - samples/sec: 2783.47 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:52:54,087 epoch 1 - iter 712/1786 - loss 0.85769517 - time (sec): 35.99 - samples/sec: 2784.06 - lr: 0.000020 - momentum: 0.000000 2023-10-17 16:53:02,572 epoch 1 - iter 890/1786 - loss 0.73275737 - time (sec): 44.48 - samples/sec: 2790.69 - lr: 0.000025 - momentum: 0.000000 2023-10-17 16:53:11,664 epoch 1 - iter 1068/1786 - loss 0.63820486 - time (sec): 53.57 - samples/sec: 2796.92 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:53:20,466 epoch 1 - iter 1246/1786 - loss 0.56890668 - time (sec): 62.37 - samples/sec: 2799.26 - lr: 0.000035 - momentum: 0.000000 2023-10-17 16:53:29,472 epoch 1 - iter 1424/1786 - loss 0.51716606 - time (sec): 71.38 - samples/sec: 2802.99 - lr: 0.000040 - momentum: 0.000000 2023-10-17 16:53:38,038 epoch 1 - iter 1602/1786 - loss 0.48029995 - time (sec): 79.94 - samples/sec: 2793.74 - lr: 0.000045 - momentum: 0.000000 2023-10-17 16:53:46,960 epoch 1 - iter 1780/1786 - loss 0.44806609 - time (sec): 88.87 - samples/sec: 2792.18 - lr: 0.000050 - momentum: 0.000000 2023-10-17 16:53:47,217 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:53:47,217 EPOCH 1 done: loss 0.4473 - lr: 0.000050 2023-10-17 16:53:50,565 DEV : loss 0.13324852287769318 - f1-score (micro avg) 0.7523 2023-10-17 16:53:50,581 saving best model 2023-10-17 16:53:50,908 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:54:00,065 epoch 2 - iter 178/1786 - loss 0.13770242 - time (sec): 9.16 - samples/sec: 2870.67 - lr: 0.000049 - momentum: 0.000000 2023-10-17 16:54:09,059 epoch 2 - iter 356/1786 - loss 0.13827377 - time (sec): 18.15 - samples/sec: 2864.91 - lr: 0.000049 - momentum: 0.000000 2023-10-17 16:54:17,988 epoch 2 - iter 534/1786 - loss 0.13360174 - time (sec): 27.08 - samples/sec: 2907.68 - lr: 0.000048 - momentum: 0.000000 2023-10-17 16:54:27,211 epoch 2 - iter 712/1786 - loss 0.12801144 - time (sec): 36.30 - samples/sec: 2833.67 - lr: 0.000048 - momentum: 0.000000 2023-10-17 16:54:36,104 epoch 2 - iter 890/1786 - loss 0.12858850 - time (sec): 45.20 - samples/sec: 2831.78 - lr: 0.000047 - momentum: 0.000000 2023-10-17 16:54:44,902 epoch 2 - iter 1068/1786 - loss 0.12567109 - time (sec): 53.99 - samples/sec: 2824.77 - lr: 0.000047 - momentum: 0.000000 2023-10-17 16:54:53,158 epoch 2 - iter 1246/1786 - loss 0.12596144 - time (sec): 62.25 - samples/sec: 2802.84 - lr: 0.000046 - momentum: 0.000000 2023-10-17 16:55:01,552 epoch 2 - iter 1424/1786 - loss 0.12696890 - time (sec): 70.64 - samples/sec: 2806.15 - lr: 0.000046 - momentum: 0.000000 2023-10-17 
2023-10-17 16:55:10,194 epoch 2 - iter 1602/1786 - loss 0.12429806 - time (sec): 79.29 - samples/sec: 2823.60 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:55:18,590 epoch 2 - iter 1780/1786 - loss 0.12508328 - time (sec): 87.68 - samples/sec: 2825.63 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:55:18,875 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:18,875 EPOCH 2 done: loss 0.1249 - lr: 0.000044
2023-10-17 16:55:23,260 DEV : loss 0.13685545325279236 - f1-score (micro avg) 0.7641
2023-10-17 16:55:23,284 saving best model
2023-10-17 16:55:23,802 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:33,412 epoch 3 - iter 178/1786 - loss 0.08647078 - time (sec): 9.61 - samples/sec: 2576.96 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:55:42,204 epoch 3 - iter 356/1786 - loss 0.08617597 - time (sec): 18.40 - samples/sec: 2580.69 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:55:51,071 epoch 3 - iter 534/1786 - loss 0.08424719 - time (sec): 27.27 - samples/sec: 2641.65 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:55:59,868 epoch 3 - iter 712/1786 - loss 0.08587144 - time (sec): 36.06 - samples/sec: 2659.75 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:56:08,770 epoch 3 - iter 890/1786 - loss 0.08794462 - time (sec): 44.97 - samples/sec: 2682.94 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:56:17,535 epoch 3 - iter 1068/1786 - loss 0.08815726 - time (sec): 53.73 - samples/sec: 2688.62 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:56:26,777 epoch 3 - iter 1246/1786 - loss 0.08905607 - time (sec): 62.97 - samples/sec: 2729.34 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:56:35,913 epoch 3 - iter 1424/1786 - loss 0.08723379 - time (sec): 72.11 - samples/sec: 2753.83 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:56:44,974 epoch 3 - iter 1602/1786 - loss 0.08706683 - time (sec): 81.17 - samples/sec: 2762.42 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:56:54,035 epoch 3 - iter 1780/1786 - loss 0.08825144 - time (sec): 90.23 - samples/sec: 2750.82 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:56:54,315 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:54,315 EPOCH 3 done: loss 0.0882 - lr: 0.000039
2023-10-17 16:56:59,077 DEV : loss 0.14459823071956635 - f1-score (micro avg) 0.7584
2023-10-17 16:56:59,106 ----------------------------------------------------------------------------------------------------
2023-10-17 16:57:08,002 epoch 4 - iter 178/1786 - loss 0.06433659 - time (sec): 8.89 - samples/sec: 2562.16 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:57:17,044 epoch 4 - iter 356/1786 - loss 0.06363777 - time (sec): 17.94 - samples/sec: 2718.43 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:57:26,135 epoch 4 - iter 534/1786 - loss 0.06827523 - time (sec): 27.03 - samples/sec: 2719.40 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:57:35,404 epoch 4 - iter 712/1786 - loss 0.06514736 - time (sec): 36.30 - samples/sec: 2728.35 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:57:45,569 epoch 4 - iter 890/1786 - loss 0.06336587 - time (sec): 46.46 - samples/sec: 2657.00 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:57:55,140 epoch 4 - iter 1068/1786 - loss 0.06381780 - time (sec): 56.03 - samples/sec: 2651.17 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:58:04,219 epoch 4 - iter 1246/1786 - loss 0.06361061 - time (sec): 65.11 - samples/sec: 2676.83 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:58:13,784 epoch 4 - iter 1424/1786 - loss 0.06257745 - time (sec): 74.68 - samples/sec: 2658.04 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:58:22,720 epoch 4 - iter 1602/1786 - loss 0.06274981 - time (sec): 83.61 - samples/sec: 2671.70 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:58:31,629 epoch 4 - iter 1780/1786 - loss 0.06404258 - time (sec): 92.52 - samples/sec: 2678.43 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:58:31,931 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:31,931 EPOCH 4 done: loss 0.0645 - lr: 0.000033
2023-10-17 16:58:36,305 DEV : loss 0.19311568140983582 - f1-score (micro avg) 0.805
2023-10-17 16:58:36,325 saving best model
2023-10-17 16:58:36,922 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:46,869 epoch 5 - iter 178/1786 - loss 0.04260479 - time (sec): 9.94 - samples/sec: 2357.89 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:58:57,159 epoch 5 - iter 356/1786 - loss 0.04983370 - time (sec): 20.24 - samples/sec: 2454.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:59:07,160 epoch 5 - iter 534/1786 - loss 0.04828306 - time (sec): 30.24 - samples/sec: 2471.15 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:59:17,235 epoch 5 - iter 712/1786 - loss 0.04893202 - time (sec): 40.31 - samples/sec: 2496.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:59:26,584 epoch 5 - iter 890/1786 - loss 0.04637195 - time (sec): 49.66 - samples/sec: 2501.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:59:36,704 epoch 5 - iter 1068/1786 - loss 0.04625644 - time (sec): 59.78 - samples/sec: 2473.47 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:59:45,822 epoch 5 - iter 1246/1786 - loss 0.04898382 - time (sec): 68.90 - samples/sec: 2511.42 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:59:54,824 epoch 5 - iter 1424/1786 - loss 0.04759295 - time (sec): 77.90 - samples/sec: 2544.97 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:00:03,625 epoch 5 - iter 1602/1786 - loss 0.04680758 - time (sec): 86.70 - samples/sec: 2587.51 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:00:12,295 epoch 5 - iter 1780/1786 - loss 0.04743130 - time (sec): 95.37 - samples/sec: 2597.21 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:00:12,588 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:12,589 EPOCH 5 done: loss 0.0474 - lr: 0.000028
2023-10-17 17:00:16,651 DEV : loss 0.17322644591331482 - f1-score (micro avg) 0.7981
2023-10-17 17:00:16,668 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:25,622 epoch 6 - iter 178/1786 - loss 0.03367132 - time (sec): 8.95 - samples/sec: 2649.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:00:34,756 epoch 6 - iter 356/1786 - loss 0.02946445 - time (sec): 18.09 - samples/sec: 2716.66 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:00:43,640 epoch 6 - iter 534/1786 - loss 0.03288777 - time (sec): 26.97 - samples/sec: 2698.65 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:00:52,601 epoch 6 - iter 712/1786 - loss 0.03447713 - time (sec): 35.93 - samples/sec: 2741.62 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:01:01,792 epoch 6 - iter 890/1786 - loss 0.03420441 - time (sec): 45.12 - samples/sec: 2753.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:01:10,808 epoch 6 - iter 1068/1786 - loss 0.03406023 - time (sec): 54.14 - samples/sec: 2754.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:01:19,758 epoch 6 - iter 1246/1786 - loss 0.03617254 - time (sec): 63.09 - samples/sec: 2758.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:01:28,690 epoch 6 - iter 1424/1786 - loss 0.03612102 - time (sec): 72.02 - samples/sec: 2735.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:37,772 epoch 6 - iter 1602/1786 - loss 0.03765794 - time (sec): 81.10 - samples/sec: 2746.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:46,880 epoch 6 - iter 1780/1786 - loss 0.03667936 - time (sec): 90.21 - samples/sec: 2751.60 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:01:47,163 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:47,163 EPOCH 6 done: loss 0.0366 - lr: 0.000022
2023-10-17 17:01:52,645 DEV : loss 0.15160760283470154 - f1-score (micro avg) 0.8108
2023-10-17 17:01:52,669 saving best model
2023-10-17 17:01:53,166 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:02,034 epoch 7 - iter 178/1786 - loss 0.02243616 - time (sec): 8.87 - samples/sec: 2670.58 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:02:10,842 epoch 7 - iter 356/1786 - loss 0.02990473 - time (sec): 17.67 - samples/sec: 2698.56 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:19,800 epoch 7 - iter 534/1786 - loss 0.02790768 - time (sec): 26.63 - samples/sec: 2745.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:28,678 epoch 7 - iter 712/1786 - loss 0.02686670 - time (sec): 35.51 - samples/sec: 2767.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:02:37,705 epoch 7 - iter 890/1786 - loss 0.02818929 - time (sec): 44.54 - samples/sec: 2794.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:02:46,743 epoch 7 - iter 1068/1786 - loss 0.02833242 - time (sec): 53.58 - samples/sec: 2778.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:02:55,461 epoch 7 - iter 1246/1786 - loss 0.02713084 - time (sec): 62.29 - samples/sec: 2786.84 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:03:04,492 epoch 7 - iter 1424/1786 - loss 0.02636015 - time (sec): 71.32 - samples/sec: 2794.61 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:03:13,874 epoch 7 - iter 1602/1786 - loss 0.02528908 - time (sec): 80.71 - samples/sec: 2787.60 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:03:22,759 epoch 7 - iter 1780/1786 - loss 0.02472222 - time (sec): 89.59 - samples/sec: 2770.71 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:03:23,041 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:23,041 EPOCH 7 done: loss 0.0247 - lr: 0.000017
2023-10-17 17:03:27,251 DEV : loss 0.19103629887104034 - f1-score (micro avg) 0.8067
2023-10-17 17:03:27,268 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:36,570 epoch 8 - iter 178/1786 - loss 0.01260488 - time (sec): 9.30 - samples/sec: 2824.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:45,688 epoch 8 - iter 356/1786 - loss 0.01459491 - time (sec): 18.42 - samples/sec: 2767.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:54,755 epoch 8 - iter 534/1786 - loss 0.01672564 - time (sec): 27.49 - samples/sec: 2767.12 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:04:03,758 epoch 8 - iter 712/1786 - loss 0.01689015 - time (sec): 36.49 - samples/sec: 2761.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:04:13,230 epoch 8 - iter 890/1786 - loss 0.01743950 - time (sec): 45.96 - samples/sec: 2746.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:04:22,645 epoch 8 - iter 1068/1786 - loss 0.01652408 - time (sec): 55.38 - samples/sec: 2739.98 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:04:32,047 epoch 8 - iter 1246/1786 - loss 0.01619837 - time (sec): 64.78 - samples/sec: 2731.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:04:40,986 epoch 8 - iter 1424/1786 - loss 0.01653528 - time (sec): 73.72 - samples/sec: 2702.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:04:50,044 epoch 8 - iter 1602/1786 - loss 0.01772735 - time (sec): 82.77 - samples/sec: 2690.22 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:04:59,741 epoch 8 - iter 1780/1786 - loss 0.01789757 - time (sec): 92.47 - samples/sec: 2680.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:05:00,037 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:00,037 EPOCH 8 done: loss 0.0178 - lr: 0.000011
2023-10-17 17:05:04,178 DEV : loss 0.19288192689418793 - f1-score (micro avg) 0.8164
2023-10-17 17:05:04,196 saving best model
2023-10-17 17:05:04,690 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:13,676 epoch 9 - iter 178/1786 - loss 0.01009022 - time (sec): 8.98 - samples/sec: 2756.42 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:05:22,516 epoch 9 - iter 356/1786 - loss 0.00986010 - time (sec): 17.82 - samples/sec: 2738.92 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:05:31,550 epoch 9 - iter 534/1786 - loss 0.00972978 - time (sec): 26.86 - samples/sec: 2715.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:05:40,522 epoch 9 - iter 712/1786 - loss 0.01005396 - time (sec): 35.83 - samples/sec: 2724.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:05:49,410 epoch 9 - iter 890/1786 - loss 0.00974288 - time (sec): 44.72 - samples/sec: 2707.57 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:58,027 epoch 9 - iter 1068/1786 - loss 0.01028181 - time (sec): 53.34 - samples/sec: 2735.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:06:06,447 epoch 9 - iter 1246/1786 - loss 0.01054176 - time (sec): 61.76 - samples/sec: 2746.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:15,508 epoch 9 - iter 1424/1786 - loss 0.01094470 - time (sec): 70.82 - samples/sec: 2794.60 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:24,863 epoch 9 - iter 1602/1786 - loss 0.01150858 - time (sec): 80.17 - samples/sec: 2785.58 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:33,903 epoch 9 - iter 1780/1786 - loss 0.01116154 - time (sec): 89.21 - samples/sec: 2777.51 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:34,220 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:34,220 EPOCH 9 done: loss 0.0112 - lr: 0.000006
2023-10-17 17:06:38,448 DEV : loss 0.20903374254703522 - f1-score (micro avg) 0.8159
2023-10-17 17:06:38,465 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:47,597 epoch 10 - iter 178/1786 - loss 0.00568008 - time (sec): 9.13 - samples/sec: 2739.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:06:56,476 epoch 10 - iter 356/1786 - loss 0.00720514 - time (sec): 18.01 - samples/sec: 2737.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:07:06,276 epoch 10 - iter 534/1786 - loss 0.00805828 - time (sec): 27.81 - samples/sec: 2667.98 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:07:15,509 epoch 10 - iter 712/1786 - loss 0.00769778 - time (sec): 37.04 - samples/sec: 2677.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:07:24,581 epoch 10 - iter 890/1786 - loss 0.00701365 - time (sec): 46.11 - samples/sec: 2705.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:07:33,235 epoch 10 - iter 1068/1786 - loss 0.00703323 - time (sec): 54.77 - samples/sec: 2724.05 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:07:42,565 epoch 10 - iter 1246/1786 - loss 0.00729937 - time (sec): 64.10 - samples/sec: 2767.76 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:07:51,492 epoch 10 - iter 1424/1786 - loss 0.00712708 - time (sec): 73.03 - samples/sec: 2778.64 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:07:59,964 epoch 10 - iter 1602/1786 - loss 0.00771465 - time (sec): 81.50 - samples/sec: 2769.03 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:08:08,709 epoch 10 - iter 1780/1786 - loss 0.00761523 - time (sec): 90.24 - samples/sec: 2746.66 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:08:08,994 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:08,995 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-17 17:08:13,185 DEV : loss 0.22629648447036743 - f1-score (micro avg) 0.8182
2023-10-17 17:08:13,202 saving best model
2023-10-17 17:08:14,229 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:14,231 Loading model from best epoch ...
2023-10-17 17:08:15,809 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 17:08:25,470 Results:
- F-score (micro) 0.7015
- F-score (macro) 0.6396
- Accuracy 0.555

By class:
              precision    recall  f1-score   support

         LOC     0.6994    0.7032    0.7013      1095
         PER     0.7741    0.7688    0.7714      1012
         ORG     0.5105    0.5462    0.5277       357
   HumanProd     0.4528    0.7273    0.5581        33

   micro avg     0.6954    0.7076    0.7015      2497
   macro avg     0.6092    0.6864    0.6396      2497
weighted avg     0.6994    0.7076    0.7030      2497
2023-10-17 17:08:25,470 ----------------------------------------------------------------------------------------------------
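Note: the micro-averaged F-score reported above is consistent with the micro precision and recall, since 2 * 0.6954 * 0.7076 / (0.6954 + 0.7076) ≈ 0.7015. The 17 predicted tags form a BIOES scheme over the four entity types (PER, LOC, ORG, HumanProd) plus O. The saved best-model.pt can be loaded for inference with Flair's standard API; a minimal sketch follows, in which the example sentence is invented and the model path simply reuses the training base path from the log.

# Inference sketch: load the saved best model and tag one sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Le maréchal Ney est arrivé à Paris .")
tagger.predict(sentence)

# Entity spans are decoded from the BIOES tags listed above.
for span in sentence.get_spans("ner"):
    print(span)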