2023-10-17 20:18:57,380 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,380 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
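The module printout above fixes every layer shape, so rough parameter counts can be read straight off it. A minimal sketch of that arithmetic (plain Python; the formulas, not the log, are the source of the numbers):

```python
# Parameter counts implied by the module shapes printed in the log above.

def linear_params(n_in, n_out, bias=True):
    """Weights plus optional bias of a Linear(n_in, n_out) layer."""
    return n_in * n_out + (n_out if bias else 0)

def layernorm_params(dim):
    """An elementwise-affine LayerNorm holds a weight and a bias vector."""
    return 2 * dim

embedding_params = (
    32001 * 768              # word_embeddings
    + 512 * 768              # position_embeddings
    + 2 * 768                # token_type_embeddings
    + layernorm_params(768)
)

per_layer_params = (
    4 * linear_params(768, 768)    # query, key, value, attention-output dense
    + layernorm_params(768)        # attention-output LayerNorm
    + linear_params(768, 3072)     # intermediate dense
    + linear_params(3072, 768)     # output dense
    + layernorm_params(768)        # output LayerNorm
)

# 12 encoder layers plus the final projection to the 17 tags.
total_params = embedding_params + 12 * per_layer_params + linear_params(768, 17)
print(embedding_params, per_layer_params, total_params)
```

This counts only the modules shown in the repr (dropouts and the loss hold no parameters), giving the familiar ~110M figure for an ELECTRA-base encoder.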
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Train:  1085 sentences
2023-10-17 20:18:57,381 (train_with_dev=False, train_with_test=False)
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Training Params:
2023-10-17 20:18:57,381  - learning_rate: "3e-05"
2023-10-17 20:18:57,381  - mini_batch_size: "4"
2023-10-17 20:18:57,381  - max_epochs: "10"
2023-10-17 20:18:57,381  - shuffle: "True"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Plugins:
2023-10-17 20:18:57,381  - TensorboardLogger
2023-10-17 20:18:57,381  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
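The Training Params and LinearScheduler lines together determine the learning-rate trajectory visible in the per-iteration log below: with 272 batches per epoch and 10 epochs (2720 steps), a warmup fraction of 0.1 covers exactly the first epoch, after which the rate decays linearly to zero. A sketch consistent with the logged lr values (plain Python; the step counts are inferred from the log, and this is an idealized schedule, not Flair's scheduler code):

```python
def linear_warmup_decay_lr(step, peak_lr=3e-5, total_steps=2720, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    total_steps = 272 batches/epoch * 10 epochs, inferred from the log above.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 272 steps = epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, at epoch 1 iter 27 this gives roughly 3e-06 and at the final step exactly 0.0, matching the rounded `lr:` values logged below.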
2023-10-17 20:18:57,381 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:18:57,381  - metric: "('micro avg', 'f1-score')"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Computation:
2023-10-17 20:18:57,381  - compute on device: cuda:0
2023-10-17 20:18:57,381  - embedding storage: none
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:57,382 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 20:18:58,943 epoch 1 - iter 27/272 - loss 3.49842970 - time (sec): 1.56 - samples/sec: 3478.53 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:19:00,464 epoch 1 - iter 54/272 - loss 2.98285681 - time (sec): 3.08 - samples/sec: 3377.50 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:19:02,165 epoch 1 - iter 81/272 - loss 2.27413183 - time (sec): 4.78 - samples/sec: 3425.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:19:03,672 epoch 1 - iter 108/272 - loss 1.87548433 - time (sec): 6.29 - samples/sec: 3393.56 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:19:05,250 epoch 1 - iter 135/272 - loss 1.59662292 - time (sec): 7.87 - samples/sec: 3384.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:19:06,917 epoch 1 - iter 162/272 - loss 1.42750032 - time (sec): 9.53 - samples/sec: 3264.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:19:08,522 epoch 1 - iter 189/272 - loss 1.27044737 - time (sec): 11.14 - samples/sec: 3258.78 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:19:10,248 epoch 1 - iter 216/272 - loss 1.12732224 - time (sec): 12.87 - samples/sec: 3264.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:11,697 epoch 1 - iter 243/272 - loss 1.04421210 - time (sec): 14.31 - samples/sec: 3251.23 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:13,235 epoch 1 - iter 270/272 - loss 0.95643037 - time (sec): 15.85 - samples/sec: 3273.19 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:19:13,321 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:13,321 EPOCH 1 done: loss 0.9554 - lr: 0.000030
2023-10-17 20:19:14,468 DEV : loss 0.1645134836435318 - f1-score (micro avg)  0.6221
2023-10-17 20:19:14,472 saving best model
2023-10-17 20:19:14,839 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:16,494 epoch 2 - iter 27/272 - loss 0.27906667 - time (sec): 1.65 - samples/sec: 3027.02 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:19:18,125 epoch 2 - iter 54/272 - loss 0.20916835 - time (sec): 3.28 - samples/sec: 3170.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:19:19,712 epoch 2 - iter 81/272 - loss 0.20515887 - time (sec): 4.87 - samples/sec: 3329.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:19:21,311 epoch 2 - iter 108/272 - loss 0.18330045 - time (sec): 6.47 - samples/sec: 3383.09 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:19:22,801 epoch 2 - iter 135/272 - loss 0.18008481 - time (sec): 7.96 - samples/sec: 3243.11 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:19:24,480 epoch 2 - iter 162/272 - loss 0.17303691 - time (sec): 9.64 - samples/sec: 3259.98 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:19:26,003 epoch 2 - iter 189/272 - loss 0.16670430 - time (sec): 11.16 - samples/sec: 3277.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:19:27,536 epoch 2 - iter 216/272 - loss 0.16179671 - time (sec): 12.70 - samples/sec: 3264.74 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:29,127 epoch 2 - iter 243/272 - loss 0.15935516 - time (sec): 14.29 - samples/sec: 3306.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:30,585 epoch 2 - iter 270/272 - loss 0.16020763 - time (sec): 15.74 - samples/sec: 3278.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:19:30,724 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:30,724 EPOCH 2 done: loss 0.1593 - lr: 0.000027
2023-10-17 20:19:32,187 DEV : loss 0.11602584272623062 - f1-score (micro avg)  0.7569
2023-10-17 20:19:32,192 saving best model
2023-10-17 20:19:32,664 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:34,201 epoch 3 - iter 27/272 - loss 0.13769909 - time (sec): 1.54 - samples/sec: 3212.12 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:19:35,730 epoch 3 - iter 54/272 - loss 0.10710548 - time (sec): 3.06 - samples/sec: 3132.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:19:37,249 epoch 3 - iter 81/272 - loss 0.09860006 - time (sec): 4.58 - samples/sec: 3234.87 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:19:38,894 epoch 3 - iter 108/272 - loss 0.09871562 - time (sec): 6.23 - samples/sec: 3237.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:19:40,386 epoch 3 - iter 135/272 - loss 0.11131359 - time (sec): 7.72 - samples/sec: 3232.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:19:41,973 epoch 3 - iter 162/272 - loss 0.10238851 - time (sec): 9.31 - samples/sec: 3253.71 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:19:43,584 epoch 3 - iter 189/272 - loss 0.09521859 - time (sec): 10.92 - samples/sec: 3221.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:45,292 epoch 3 - iter 216/272 - loss 0.09422831 - time (sec): 12.63 - samples/sec: 3282.77 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:46,757 epoch 3 - iter 243/272 - loss 0.09501369 - time (sec): 14.09 - samples/sec: 3270.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:19:48,451 epoch 3 - iter 270/272 - loss 0.09190776 - time (sec): 15.79 - samples/sec: 3276.78 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:19:48,547 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:48,548 EPOCH 3 done: loss 0.0917 - lr: 0.000023
2023-10-17 20:19:50,028 DEV : loss 0.12322476506233215 - f1-score (micro avg)  0.7726
2023-10-17 20:19:50,033 saving best model
2023-10-17 20:19:50,522 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:52,218 epoch 4 - iter 27/272 - loss 0.05161412 - time (sec): 1.69 - samples/sec: 3503.45 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:19:53,715 epoch 4 - iter 54/272 - loss 0.06291621 - time (sec): 3.19 - samples/sec: 3369.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:19:55,380 epoch 4 - iter 81/272 - loss 0.05796083 - time (sec): 4.85 - samples/sec: 3516.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:19:57,049 epoch 4 - iter 108/272 - loss 0.05358689 - time (sec): 6.52 - samples/sec: 3485.59 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:19:58,571 epoch 4 - iter 135/272 - loss 0.05313206 - time (sec): 8.04 - samples/sec: 3431.03 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:20:00,053 epoch 4 - iter 162/272 - loss 0.05414281 - time (sec): 9.53 - samples/sec: 3360.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:20:01,527 epoch 4 - iter 189/272 - loss 0.05620792 - time (sec): 11.00 - samples/sec: 3330.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:20:03,138 epoch 4 - iter 216/272 - loss 0.05401160 - time (sec): 12.61 - samples/sec: 3350.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:20:04,708 epoch 4 - iter 243/272 - loss 0.05317398 - time (sec): 14.18 - samples/sec: 3318.36 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:20:06,264 epoch 4 - iter 270/272 - loss 0.05747365 - time (sec): 15.74 - samples/sec: 3292.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:20:06,357 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:06,357 EPOCH 4 done: loss 0.0574 - lr: 0.000020
2023-10-17 20:20:07,876 DEV : loss 0.10572109371423721 - f1-score (micro avg)  0.7957
2023-10-17 20:20:07,881 saving best model
2023-10-17 20:20:08,356 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:09,944 epoch 5 - iter 27/272 - loss 0.04695874 - time (sec): 1.58 - samples/sec: 3262.06 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:20:11,789 epoch 5 - iter 54/272 - loss 0.03793113 - time (sec): 3.43 - samples/sec: 3016.82 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:20:13,355 epoch 5 - iter 81/272 - loss 0.03745768 - time (sec): 5.00 - samples/sec: 3074.56 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:20:15,003 epoch 5 - iter 108/272 - loss 0.04081558 - time (sec): 6.64 - samples/sec: 3116.89 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:20:16,624 epoch 5 - iter 135/272 - loss 0.03983865 - time (sec): 8.27 - samples/sec: 3181.61 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:20:18,274 epoch 5 - iter 162/272 - loss 0.03836222 - time (sec): 9.91 - samples/sec: 3191.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:20:19,760 epoch 5 - iter 189/272 - loss 0.03794759 - time (sec): 11.40 - samples/sec: 3221.77 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:20:21,352 epoch 5 - iter 216/272 - loss 0.03723302 - time (sec): 12.99 - samples/sec: 3247.45 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:20:22,775 epoch 5 - iter 243/272 - loss 0.03647845 - time (sec): 14.42 - samples/sec: 3215.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:20:24,400 epoch 5 - iter 270/272 - loss 0.03615469 - time (sec): 16.04 - samples/sec: 3224.72 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:20:24,506 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:24,506 EPOCH 5 done: loss 0.0362 - lr: 0.000017
2023-10-17 20:20:26,058 DEV : loss 0.13827396929264069 - f1-score (micro avg)  0.8277
2023-10-17 20:20:26,067 saving best model
2023-10-17 20:20:26,535 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:28,215 epoch 6 - iter 27/272 - loss 0.02345061 - time (sec): 1.68 - samples/sec: 3515.65 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:20:29,742 epoch 6 - iter 54/272 - loss 0.02926596 - time (sec): 3.20 - samples/sec: 3445.42 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:20:31,347 epoch 6 - iter 81/272 - loss 0.02844781 - time (sec): 4.81 - samples/sec: 3429.11 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:20:32,957 epoch 6 - iter 108/272 - loss 0.02832356 - time (sec): 6.42 - samples/sec: 3439.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:20:34,468 epoch 6 - iter 135/272 - loss 0.02441837 - time (sec): 7.93 - samples/sec: 3422.38 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:20:36,011 epoch 6 - iter 162/272 - loss 0.02357592 - time (sec): 9.47 - samples/sec: 3354.70 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:20:37,620 epoch 6 - iter 189/272 - loss 0.02270423 - time (sec): 11.08 - samples/sec: 3339.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:20:39,247 epoch 6 - iter 216/272 - loss 0.02279667 - time (sec): 12.71 - samples/sec: 3328.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:20:40,859 epoch 6 - iter 243/272 - loss 0.02430342 - time (sec): 14.32 - samples/sec: 3268.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:20:42,548 epoch 6 - iter 270/272 - loss 0.02394392 - time (sec): 16.01 - samples/sec: 3242.86 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:20:42,640 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:42,640 EPOCH 6 done: loss 0.0240 - lr: 0.000013
2023-10-17 20:20:44,131 DEV : loss 0.1619756519794464 - f1-score (micro avg)  0.8312
2023-10-17 20:20:44,137 saving best model
2023-10-17 20:20:44,626 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:46,271 epoch 7 - iter 27/272 - loss 0.02264084 - time (sec): 1.64 - samples/sec: 3113.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:20:47,825 epoch 7 - iter 54/272 - loss 0.02071817 - time (sec): 3.20 - samples/sec: 3019.35 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:20:49,406 epoch 7 - iter 81/272 - loss 0.02343202 - time (sec): 4.78 - samples/sec: 3152.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:20:50,905 epoch 7 - iter 108/272 - loss 0.01975467 - time (sec): 6.28 - samples/sec: 3177.67 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:20:52,452 epoch 7 - iter 135/272 - loss 0.01965074 - time (sec): 7.82 - samples/sec: 3205.69 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:20:53,980 epoch 7 - iter 162/272 - loss 0.01819110 - time (sec): 9.35 - samples/sec: 3284.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:20:55,502 epoch 7 - iter 189/272 - loss 0.01711362 - time (sec): 10.87 - samples/sec: 3263.90 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:20:57,235 epoch 7 - iter 216/272 - loss 0.01721239 - time (sec): 12.61 - samples/sec: 3280.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:20:58,851 epoch 7 - iter 243/272 - loss 0.01814544 - time (sec): 14.22 - samples/sec: 3309.19 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:21:00,340 epoch 7 - iter 270/272 - loss 0.01871976 - time (sec): 15.71 - samples/sec: 3287.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:21:00,436 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:00,436 EPOCH 7 done: loss 0.0192 - lr: 0.000010
2023-10-17 20:21:01,886 DEV : loss 0.17093084752559662 - f1-score (micro avg)  0.8014
2023-10-17 20:21:01,891 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:03,667 epoch 8 - iter 27/272 - loss 0.03034949 - time (sec): 1.77 - samples/sec: 3367.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:21:05,441 epoch 8 - iter 54/272 - loss 0.01887047 - time (sec): 3.55 - samples/sec: 3465.10 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:21:07,139 epoch 8 - iter 81/272 - loss 0.01601346 - time (sec): 5.25 - samples/sec: 3414.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:21:08,606 epoch 8 - iter 108/272 - loss 0.01694315 - time (sec): 6.71 - samples/sec: 3413.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:21:10,142 epoch 8 - iter 135/272 - loss 0.01991374 - time (sec): 8.25 - samples/sec: 3346.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:21:11,696 epoch 8 - iter 162/272 - loss 0.01753446 - time (sec): 9.80 - samples/sec: 3335.44 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:21:13,344 epoch 8 - iter 189/272 - loss 0.01608672 - time (sec): 11.45 - samples/sec: 3325.17 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:21:14,722 epoch 8 - iter 216/272 - loss 0.01646087 - time (sec): 12.83 - samples/sec: 3273.02 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:21:16,217 epoch 8 - iter 243/272 - loss 0.01560080 - time (sec): 14.32 - samples/sec: 3279.05 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:21:17,702 epoch 8 - iter 270/272 - loss 0.01463536 - time (sec): 15.81 - samples/sec: 3279.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:21:17,793 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:17,793 EPOCH 8 done: loss 0.0146 - lr: 0.000007
2023-10-17 20:21:19,314 DEV : loss 0.1792607456445694 - f1-score (micro avg)  0.8324
2023-10-17 20:21:19,319 saving best model
2023-10-17 20:21:19,789 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:21,380 epoch 9 - iter 27/272 - loss 0.01112128 - time (sec): 1.59 - samples/sec: 3317.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:21:22,943 epoch 9 - iter 54/272 - loss 0.01026673 - time (sec): 3.15 - samples/sec: 3421.86 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:21:24,647 epoch 9 - iter 81/272 - loss 0.00770384 - time (sec): 4.86 - samples/sec: 3345.55 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:21:26,199 epoch 9 - iter 108/272 - loss 0.01015661 - time (sec): 6.41 - samples/sec: 3332.79 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:21:28,030 epoch 9 - iter 135/272 - loss 0.00954074 - time (sec): 8.24 - samples/sec: 3254.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:21:29,635 epoch 9 - iter 162/272 - loss 0.00918188 - time (sec): 9.84 - samples/sec: 3221.84 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:21:31,245 epoch 9 - iter 189/272 - loss 0.00883662 - time (sec): 11.45 - samples/sec: 3202.32 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:21:32,954 epoch 9 - iter 216/272 - loss 0.00910147 - time (sec): 13.16 - samples/sec: 3195.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:21:34,526 epoch 9 - iter 243/272 - loss 0.00857958 - time (sec): 14.74 - samples/sec: 3191.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:21:36,049 epoch 9 - iter 270/272 - loss 0.01017541 - time (sec): 16.26 - samples/sec: 3181.49 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:36,152 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:36,152 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-17 20:21:37,635 DEV : loss 0.17717301845550537 - f1-score (micro avg)  0.8349
2023-10-17 20:21:37,640 saving best model
2023-10-17 20:21:38,109 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:39,597 epoch 10 - iter 27/272 - loss 0.01678163 - time (sec): 1.49 - samples/sec: 3143.36 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:41,151 epoch 10 - iter 54/272 - loss 0.00835340 - time (sec): 3.04 - samples/sec: 3149.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:42,706 epoch 10 - iter 81/272 - loss 0.00669184 - time (sec): 4.60 - samples/sec: 3171.36 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:21:44,283 epoch 10 - iter 108/272 - loss 0.00651137 - time (sec): 6.17 - samples/sec: 3174.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:21:45,753 epoch 10 - iter 135/272 - loss 0.00753126 - time (sec): 7.64 - samples/sec: 3159.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:21:47,290 epoch 10 - iter 162/272 - loss 0.00735858 - time (sec): 9.18 - samples/sec: 3222.10 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:21:48,903 epoch 10 - iter 189/272 - loss 0.00770349 - time (sec): 10.79 - samples/sec: 3239.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:21:50,616 epoch 10 - iter 216/272 - loss 0.00858246 - time (sec): 12.51 - samples/sec: 3235.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:21:52,438 epoch 10 - iter 243/272 - loss 0.00873740 - time (sec): 14.33 - samples/sec: 3221.40 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:21:54,049 epoch 10 - iter 270/272 - loss 0.00867997 - time (sec): 15.94 - samples/sec: 3247.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:21:54,154 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:54,155 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-17 20:21:55,641 DEV : loss 0.1778741031885147 - f1-score (micro avg)  0.8367
2023-10-17 20:21:55,646 saving best model
2023-10-17 20:21:56,611 ----------------------------------------------------------------------------------------------------
2023-10-17 20:21:56,613 Loading model from best epoch ...
2023-10-17 20:21:58,459 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
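The tag dictionary above uses the BIOES scheme (Begin, Inside, End, Single, plus O) over four entity types, which is exactly what yields 17 tags: 4 types x 4 positional prefixes + 1. A minimal decoder sketch showing how such a tag sequence maps to entity spans (plain Python; an illustrative decoder, not Flair's own implementation):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, typ = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((typ, i, i + 1))
            start, label = None, None
        elif prefix == "B":                    # open a multi-token entity
            start, label = i, typ
        elif prefix == "E" and label == typ:   # close the open entity
            spans.append((label, start, i + 1))
            start, label = None, None
        elif prefix != "I":                    # "O" or an inconsistent tag: reset
            start, label = None, None
    return spans

# e.g. ["S-LOC", "O", "B-PER", "I-PER", "E-PER"] -> [("LOC", 0, 1), ("PER", 2, 5)]
```

Per-type this gives the four prefixed tags seen in the dictionary (S-LOC, B-LOC, E-LOC, I-LOC, and so on), and `4 * 4 + 1 == 17`.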
2023-10-17 20:22:00,749
Results:
- F-score (micro) 0.7949
- F-score (macro) 0.7491
- Accuracy 0.6785

By class:
              precision    recall  f1-score   support

         LOC     0.8171    0.8590    0.8375       312
         PER     0.7194    0.8750    0.7896       208
         ORG     0.5745    0.4909    0.5294        55
   HumanProd     0.7500    0.9545    0.8400        22

   micro avg     0.7591    0.8342    0.7949       597
   macro avg     0.7152    0.7949    0.7491       597
weighted avg     0.7582    0.8342    0.7925       597
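The summary rows of the test report follow directly from the per-class table: micro F1 is the harmonic mean of micro precision and recall, macro F1 is the unweighted mean of the per-class F1 scores, and the weighted average weights each class by its support. A quick cross-check (plain Python; the numbers are copied from the table above, so agreement holds only up to rounding):

```python
# (precision, recall, f1, support) per class, copied from the table above.
per_class = {
    "LOC":       (0.8171, 0.8590, 0.8375, 312),
    "PER":       (0.7194, 0.8750, 0.7896, 208),
    "ORG":       (0.5745, 0.4909, 0.5294,  55),
    "HumanProd": (0.7500, 0.9545, 0.8400,  22),
}

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.7591, 0.8342
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total_support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# -> 0.7949 0.7491 0.7925
```

All three recomputed values match the logged summary rows to four decimal places.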
2023-10-17 20:22:00,749 ----------------------------------------------------------------------------------------------------