2023-10-17 20:18:57,380 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,380 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Train: 1085 sentences
2023-10-17 20:18:57,381 (train_with_dev=False, train_with_test=False)
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Training Params:
2023-10-17 20:18:57,381  - learning_rate: "3e-05"
2023-10-17 20:18:57,381  - mini_batch_size: "4"
2023-10-17 20:18:57,381  - max_epochs: "10"
2023-10-17 20:18:57,381  - shuffle: "True"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Plugins:
2023-10-17 20:18:57,381  - TensorboardLogger
2023-10-17 20:18:57,381  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:18:57,381  - metric: "('micro avg', 'f1-score')"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Computation:
2023-10-17 20:18:57,381  - compute on device: cuda:0
2023-10-17 20:18:57,381  - embedding storage: none
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,382 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 20:18:58,943 epoch 1 - iter 27/272 - loss 3.49842970 - time (sec): 1.56 - samples/sec: 3478.53 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:19:00,464 epoch 1 - iter 54/272 - loss 2.98285681 - time (sec): 3.08 - samples/sec: 3377.50 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:19:02,165 epoch 1 - iter 81/272 - loss 2.27413183 - time (sec): 4.78 - samples/sec: 3425.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:19:03,672 epoch 1 - iter 108/272 - loss 1.87548433 - time (sec): 6.29 - samples/sec: 3393.56 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:19:05,250 epoch 1 - iter 135/272 - loss 1.59662292 - time (sec): 7.87 - samples/sec: 3384.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:19:06,917 epoch 1 - iter 162/272 - loss 1.42750032 - time (sec): 9.53 - samples/sec: 3264.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:19:08,522 epoch 1 - iter 189/272 - loss 1.27044737 - time (sec): 11.14 - samples/sec: 3258.78 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:19:10,248 epoch 1 - iter 216/272 - loss 1.12732224 - time (sec): 12.87 - samples/sec: 3264.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:11,697 epoch 1 - iter 243/272 - loss 1.04421210 - time (sec): 14.31 - samples/sec: 3251.23 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:13,235 epoch 1 - iter 270/272 - loss 0.95643037 - time (sec): 15.85 - samples/sec: 3273.19 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:19:13,321 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:13,321 EPOCH 1 done: loss 0.9554 - lr: 0.000030
2023-10-17 20:19:14,468 DEV : loss 0.1645134836435318 - f1-score (micro avg) 0.6221
2023-10-17 20:19:14,472 saving best model
2023-10-17 20:19:14,839 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:16,494 epoch 2 - iter 27/272 - loss 0.27906667 - time (sec): 1.65 - samples/sec: 3027.02 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:19:18,125 epoch 2 - iter 54/272 - loss 0.20916835 - time (sec): 3.28 - samples/sec: 3170.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:19:19,712 epoch 2 - iter 81/272 - loss 0.20515887 - time (sec): 4.87 - samples/sec: 3329.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:19:21,311 epoch 2 - iter 108/272 - loss 0.18330045 - time (sec): 6.47 - samples/sec: 3383.09 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:19:22,801 epoch 2 - iter 135/272 - loss 0.18008481 - time (sec): 7.96 - samples/sec: 3243.11 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:19:24,480 epoch 2 - iter 162/272 - loss 0.17303691 - time (sec): 9.64 - samples/sec: 3259.98 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:19:26,003 epoch 2 - iter 189/272 - loss 0.16670430 - time (sec): 11.16 - samples/sec: 3277.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:19:27,536 epoch 2 - iter 216/272 - loss 0.16179671 - time (sec): 12.70 - samples/sec: 3264.74 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:29,127 epoch 2 - iter 243/272 - loss 0.15935516 - time (sec): 14.29 - samples/sec: 3306.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:30,585 epoch 2 - iter 270/272 - loss 0.16020763 - time (sec): 15.74 - samples/sec: 3278.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:30,724 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:30,724 EPOCH 2 done: loss 0.1593 - lr: 0.000027
2023-10-17 20:19:32,187 DEV : loss 0.11602584272623062 - f1-score (micro avg) 0.7569
2023-10-17 20:19:32,192 saving best model
2023-10-17 20:19:32,664 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:34,201 epoch 3 - iter 27/272 - loss 0.13769909 - time (sec): 1.54 - samples/sec: 3212.12 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:19:35,730 epoch 3 - iter 54/272 - loss 0.10710548 - time (sec): 3.06 - samples/sec: 3132.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:19:37,249 epoch 3 - iter 81/272 - loss 0.09860006 - time (sec): 4.58 - samples/sec: 3234.87 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:19:38,894 epoch 3 - iter 108/272 - loss 0.09871562 - time (sec): 6.23 - samples/sec: 3237.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:19:40,386 epoch 3 - iter 135/272 - loss 0.11131359 - time (sec): 7.72 - samples/sec: 3232.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:19:41,973 epoch 3 - iter 162/272 - loss 0.10238851 - time (sec): 9.31 - samples/sec: 3253.71 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:19:43,584 epoch 3 - iter 189/272 - loss 0.09521859 - time (sec): 10.92 - samples/sec: 3221.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:45,292 epoch 3 - iter 216/272 - loss 0.09422831 - time (sec): 12.63 - samples/sec: 3282.77 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:46,757 epoch 3 - iter 243/272 - loss 0.09501369 - time (sec): 14.09 - samples/sec: 3270.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:48,451 epoch 3 - iter 270/272 - loss 0.09190776 - time (sec): 15.79 - samples/sec: 3276.78 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:19:48,547 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:48,548 EPOCH 3 done: loss 0.0917 - lr: 0.000023
2023-10-17 20:19:50,028 DEV : loss 0.12322476506233215 - f1-score (micro avg) 0.7726
2023-10-17 20:19:50,033 saving best model
2023-10-17 20:19:50,522 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:52,218 epoch 4 - iter 27/272 - loss 0.05161412 - time (sec): 1.69 - samples/sec: 3503.45 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:19:53,715 epoch 4 - iter 54/272 - loss 0.06291621 - time (sec): 3.19 - samples/sec: 3369.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:19:55,380 epoch 4 - iter 81/272 - loss 0.05796083 - time (sec): 4.85 - samples/sec: 3516.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:19:57,049 epoch 4 - iter 108/272 - loss 0.05358689 - time (sec): 6.52 - samples/sec: 3485.59 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:19:58,571 epoch 4 - iter 135/272 - loss 0.05313206 - time (sec): 8.04 - samples/sec: 3431.03 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:20:00,053 epoch 4 - iter 162/272 - loss 0.05414281 - time (sec): 9.53 - samples/sec: 3360.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:20:01,527 epoch 4 - iter 189/272 - loss 0.05620792 - time (sec): 11.00 - samples/sec: 3330.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:20:03,138 epoch 4 - iter 216/272 - loss 0.05401160 - time (sec): 12.61 - samples/sec: 3350.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:20:04,708 epoch 4 - iter 243/272 - loss 0.05317398 - time (sec): 14.18 - samples/sec: 3318.36 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:20:06,264 epoch 4 - iter 270/272 - loss 0.05747365 - time (sec): 15.74 - samples/sec: 3292.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:20:06,357 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:06,357 EPOCH 4 done: loss 0.0574 - lr: 0.000020
2023-10-17 20:20:07,876 DEV : loss 0.10572109371423721 - f1-score (micro avg) 0.7957
2023-10-17 20:20:07,881 saving best model
2023-10-17 20:20:08,356 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:09,944 epoch 5 - iter 27/272 - loss 0.04695874 - time (sec): 1.58 - samples/sec: 3262.06 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:20:11,789 epoch 5 - iter 54/272 - loss 0.03793113 - time (sec): 3.43 - samples/sec: 3016.82 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:20:13,355 epoch 5 - iter 81/272 - loss 0.03745768 - time (sec): 5.00 - samples/sec: 3074.56 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:20:15,003 epoch 5 - iter 108/272 - loss 0.04081558 - time (sec): 6.64 - samples/sec: 3116.89 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:20:16,624 epoch 5 - iter 135/272 - loss 0.03983865 - time (sec): 8.27 - samples/sec: 3181.61 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:20:18,274 epoch 5 - iter 162/272 - loss 0.03836222 - time (sec): 9.91 - samples/sec: 3191.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:20:19,760 epoch 5 - iter 189/272 - loss 0.03794759 - time (sec): 11.40 - samples/sec: 3221.77 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:20:21,352 epoch 5 - iter 216/272 - loss 0.03723302 - time (sec): 12.99 - samples/sec: 3247.45 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:20:22,775 epoch 5 - iter 243/272 - loss 0.03647845 - time (sec): 14.42 - samples/sec: 3215.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:20:24,400 epoch 5 - iter 270/272 - loss 0.03615469 - time (sec): 16.04 - samples/sec: 3224.72 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:20:24,506 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:24,506 EPOCH 5 done: loss 0.0362 - lr: 0.000017
2023-10-17 20:20:26,058 DEV : loss 0.13827396929264069 - f1-score (micro avg) 0.8277
2023-10-17 20:20:26,067 saving best model
2023-10-17 20:20:26,535 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:28,215 epoch 6 - iter 27/272 - loss 0.02345061 - time (sec): 1.68 - samples/sec: 3515.65 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:20:29,742 epoch 6 - iter 54/272 - loss 0.02926596 - time (sec): 3.20 - samples/sec: 3445.42 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:20:31,347 epoch 6 - iter 81/272 - loss 0.02844781 - time (sec): 4.81 - samples/sec: 3429.11 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:20:32,957 epoch 6 - iter 108/272 - loss 0.02832356 - time (sec): 6.42 - samples/sec: 3439.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:20:34,468 epoch 6 - iter 135/272 - loss 0.02441837 - time (sec): 7.93 - samples/sec: 3422.38 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:20:36,011 epoch 6 - iter 162/272 - loss 0.02357592 - time (sec): 9.47 - samples/sec: 3354.70 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:20:37,620 epoch 6 - iter 189/272 - loss 0.02270423 - time (sec): 11.08 - samples/sec: 3339.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:20:39,247 epoch 6 - iter 216/272 - loss 0.02279667 - time (sec): 12.71 - samples/sec: 3328.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:20:40,859 epoch 6 - iter 243/272 - loss 0.02430342 - time (sec): 14.32 - samples/sec: 3268.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:20:42,548 epoch 6 - iter 270/272 - loss 0.02394392 - time (sec): 16.01 - samples/sec: 3242.86 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:20:42,640 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:42,640 EPOCH 6 done: loss 0.0240 - lr: 0.000013
2023-10-17 20:20:44,131 DEV : loss 0.1619756519794464 - f1-score (micro avg) 0.8312
2023-10-17 20:20:44,137 saving best model
2023-10-17 20:20:44,626 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:46,271 epoch 7 - iter 27/272 - loss 0.02264084 - time (sec): 1.64 - samples/sec: 3113.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:20:47,825 epoch 7 - iter 54/272 - loss 0.02071817 - time (sec): 3.20 - samples/sec: 3019.35 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:20:49,406 epoch 7 - iter 81/272 - loss 0.02343202 - time (sec): 4.78 - samples/sec: 3152.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:20:50,905 epoch 7 - iter 108/272 - loss 0.01975467 - time (sec): 6.28 - samples/sec: 3177.67 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:20:52,452 epoch 7 - iter 135/272 - loss 0.01965074 - time (sec): 7.82 - samples/sec: 3205.69 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:20:53,980 epoch 7 - iter 162/272 - loss 0.01819110 - time (sec): 9.35 - samples/sec: 3284.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:20:55,502 epoch 7 - iter 189/272 - loss 0.01711362 - time (sec): 10.87 - samples/sec: 3263.90 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:20:57,235 epoch 7 - iter 216/272 - loss 0.01721239 - time (sec): 12.61 - samples/sec: 3280.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:20:58,851 epoch 7 - iter 243/272 - loss 0.01814544 - time (sec): 14.22 - samples/sec: 3309.19 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:21:00,340 epoch 7 - iter 270/272 - loss 0.01871976 - time (sec): 15.71 - samples/sec: 3287.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:21:00,436 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:00,436 EPOCH 7 done: loss 0.0192 - lr: 0.000010
2023-10-17 20:21:01,886 DEV : loss 0.17093084752559662 - f1-score (micro avg) 0.8014
2023-10-17 20:21:01,891 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:03,667 epoch 8 - iter 27/272 - loss 0.03034949 - time (sec): 1.77 - samples/sec: 3367.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:21:05,441 epoch 8 - iter 54/272 - loss 0.01887047 - time (sec): 3.55 - samples/sec: 3465.10 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:21:07,139 epoch 8 - iter 81/272 - loss 0.01601346 - time (sec): 5.25 - samples/sec: 3414.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:21:08,606 epoch 8 - iter 108/272 - loss 0.01694315 - time (sec): 6.71 - samples/sec: 3413.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:21:10,142 epoch 8 - iter 135/272 - loss 0.01991374 - time (sec): 8.25 - samples/sec: 3346.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:21:11,696 epoch 8 - iter 162/272 - loss 0.01753446 - time (sec): 9.80 - samples/sec: 3335.44 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:21:13,344 epoch 8 - iter 189/272 - loss 0.01608672 - time (sec): 11.45 - samples/sec: 3325.17 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:21:14,722 epoch 8 - iter 216/272 - loss 0.01646087 - time (sec): 12.83 - samples/sec: 3273.02 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:21:16,217 epoch 8 - iter 243/272 - loss 0.01560080 - time (sec): 14.32 - samples/sec: 3279.05 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:21:17,702 epoch 8 - iter 270/272 - loss 0.01463536 - time (sec): 15.81 - samples/sec: 3279.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:21:17,793 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:17,793 EPOCH 8 done: loss 0.0146 - lr: 0.000007
2023-10-17 20:21:19,314 DEV : loss 0.1792607456445694 - f1-score (micro avg) 0.8324
2023-10-17 20:21:19,319 saving best model
2023-10-17 20:21:19,789 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:21,380 epoch 9 - iter 27/272 - loss 0.01112128 - time (sec): 1.59 - samples/sec: 3317.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:21:22,943 epoch 9 - iter 54/272 - loss 0.01026673 - time (sec): 3.15 - samples/sec: 3421.86 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:21:24,647 epoch 9 - iter 81/272 - loss 0.00770384 - time (sec): 4.86 - samples/sec: 3345.55 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:21:26,199 epoch 9 - iter 108/272 - loss 0.01015661 - time (sec): 6.41 - samples/sec: 3332.79 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:21:28,030 epoch 9 - iter 135/272 - loss 0.00954074 - time (sec): 8.24 - samples/sec: 3254.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:21:29,635 epoch 9 - iter 162/272 - loss 0.00918188 - time (sec): 9.84 - samples/sec: 3221.84 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:21:31,245 epoch 9 - iter 189/272 - loss 0.00883662 - time (sec): 11.45 - samples/sec: 3202.32 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:21:32,954 epoch 9 - iter 216/272 - loss 0.00910147 - time (sec): 13.16 - samples/sec: 3195.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:21:34,526 epoch 9 - iter 243/272 - loss 0.00857958 - time (sec): 14.74 - samples/sec: 3191.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:21:36,049 epoch 9 - iter 270/272 - loss 0.01017541 - time (sec): 16.26 - samples/sec: 3181.49 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:36,152 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:36,152 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-17 20:21:37,635 DEV : loss 0.17717301845550537 - f1-score (micro avg) 0.8349
2023-10-17 20:21:37,640 saving best model
2023-10-17 20:21:38,109 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:39,597 epoch 10 - iter 27/272 - loss 0.01678163 - time (sec): 1.49 - samples/sec: 3143.36 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:41,151 epoch 10 - iter 54/272 - loss 0.00835340 - time (sec): 3.04 - samples/sec: 3149.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:42,706 epoch 10 - iter 81/272 - loss 0.00669184 - time (sec): 4.60 - samples/sec: 3171.36 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:21:44,283 epoch 10 - iter 108/272 - loss 0.00651137 - time (sec): 6.17 - samples/sec: 3174.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:21:45,753 epoch 10 - iter 135/272 - loss 0.00753126 - time (sec): 7.64 - samples/sec: 3159.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:21:47,290 epoch 10 - iter 162/272 - loss 0.00735858 - time (sec): 9.18 - samples/sec: 3222.10 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:21:48,903 epoch 10 - iter 189/272 - loss 0.00770349 - time (sec): 10.79 - samples/sec: 3239.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:21:50,616 epoch 10 - iter 216/272 - loss 0.00858246 - time (sec): 12.51 - samples/sec: 3235.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:21:52,438 epoch 10 - iter 243/272 - loss 0.00873740 - time (sec): 14.33 - samples/sec: 3221.40 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:21:54,049 epoch 10 - iter 270/272 - loss 0.00867997 - time (sec): 15.94 - samples/sec: 3247.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:21:54,154 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:54,155 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-17 20:21:55,641 DEV : loss 0.1778741031885147 - f1-score (micro avg) 0.8367
2023-10-17 20:21:55,646 saving best model
2023-10-17 20:21:56,611 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:56,613 Loading model from best epoch ...
2023-10-17 20:21:58,459 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 20:22:00,749 Results:
- F-score (micro) 0.7949
- F-score (macro) 0.7491
- Accuracy 0.6785

By class:
              precision    recall  f1-score   support

         LOC     0.8171    0.8590    0.8375       312
         PER     0.7194    0.8750    0.7896       208
         ORG     0.5745    0.4909    0.5294        55
   HumanProd     0.7500    0.9545    0.8400        22

   micro avg     0.7591    0.8342    0.7949       597
   macro avg     0.7152    0.7949    0.7491       597
weighted avg     0.7582    0.8342    0.7925       597

2023-10-17 20:22:00,749 ----------------------------------------------------------------------------------------------------
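Each per-iteration entry above follows a fixed `key: value` layout, so the training curve (loss, throughput, learning rate) can be recovered from the log with a small parser. The sketch below is a hypothetical helper, not part of Flair; the regex and field names are assumptions read off the entries in this file:

```python
import re
from typing import Optional

# Matches the per-iteration entries in the log above, e.g.
# "epoch 1 - iter 27/272 - loss 3.49842970 - time (sec): 1.56 - samples/sec: 3478.53 - lr: 0.000003"
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+)"
    r" - samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line: str) -> Optional[dict]:
    """Return the numeric fields of an iteration entry, or None for other lines."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "epoch": int(d["epoch"]),
        "iter": int(d["iter"]),
        "total": int(d["total"]),
        "loss": float(d["loss"]),
        "time_sec": float(d["time"]),
        "samples_per_sec": float(d["sps"]),
        "lr": float(d["lr"]),
    }

sample = ("2023-10-17 20:18:58,943 epoch 1 - iter 27/272 - loss 3.49842970"
          " - time (sec): 1.56 - samples/sec: 3478.53 - lr: 0.000003 - momentum: 0.000000")
record = parse_iter_line(sample)
```

Applied line by line over the whole file, this yields 10 epochs x 10 records, enough to plot the loss and the linear warmup/decay schedule visible in the `lr` column.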
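The final classification report is internally consistent: the micro-avg f1-score (0.7949) is the harmonic mean of the reported micro-avg precision (0.7591) and recall (0.8342). A quick check, using only numbers taken from the report above:

```python
# Micro-avg precision and recall from the "By class" report above.
precision, recall = 0.7591, 0.8342

# F1 is the harmonic mean of precision and recall: 2PR / (P + R).
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.7949, matching the reported micro-avg f1-score
```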