|
2023-10-17 20:35:47,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,611 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 20:35:47,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,611 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-17 20:35:47,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,611 Train: 1085 sentences |
|
2023-10-17 20:35:47,611 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 20:35:47,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,611 Training Params: |
|
2023-10-17 20:35:47,611 - learning_rate: "5e-05" |
|
2023-10-17 20:35:47,611 - mini_batch_size: "4" |
|
2023-10-17 20:35:47,611 - max_epochs: "10" |
|
2023-10-17 20:35:47,611 - shuffle: "True" |
|
2023-10-17 20:35:47,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,612 Plugins: |
|
2023-10-17 20:35:47,612 - TensorboardLogger |
|
2023-10-17 20:35:47,612 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 20:35:47,612 ---------------------------------------------------------------------------------------------------- |
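Note on the learning-rate column: the `lr` values logged below are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below reproduces a few of them from the parameters above. It is not Flair's `LinearScheduler` code; the function name `linear_warmup_lr` and the zero-based step index (the rate logged at iteration `i` matching step `i - 1`) are assumptions inferred from the logged numbers.

```python
import math

# Values taken from the log above.
TRAIN_SENTENCES = 1085
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
BASE_LR = 5e-05
WARMUP_FRACTION = 0.1

# 1085 sentences in batches of 4 gives 272 iterations per epoch.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS


def linear_warmup_lr(step: int) -> float:
    """Hypothetical helper: linear warmup, then linear decay to zero."""
    warmup_steps = round(total_steps * WARMUP_FRACTION)
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    return BASE_LR * max(0, total_steps - step) / (total_steps - warmup_steps)


def logged_lr(epoch: int, iteration: int) -> str:
    """Format the rate as the log does (six decimals), assuming a zero-based step."""
    step = (epoch - 1) * steps_per_epoch + (iteration - 1)
    return f"{linear_warmup_lr(step):.6f}"


for epoch, iteration in [(1, 27), (1, 243), (5, 27), (10, 270)]:
    print(epoch, iteration, logged_lr(epoch, iteration))
```

With these assumptions the peak rate of 5e-05 is reached at the end of epoch 1, which is why the logged value climbs during epoch 1 and falls in every later epoch.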
|
2023-10-17 20:35:47,612 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 20:35:47,612 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 20:35:47,612 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,612 Computation: |
|
2023-10-17 20:35:47,612 - compute on device: cuda:0 |
|
2023-10-17 20:35:47,612 - embedding storage: none |
|
2023-10-17 20:35:47,612 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,612 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-17 20:35:47,612 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,612 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:35:47,612 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 20:35:49,093 epoch 1 - iter 27/272 - loss 3.53390037 - time (sec): 1.48 - samples/sec: 3477.51 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 20:35:50,726 epoch 1 - iter 54/272 - loss 2.67097506 - time (sec): 3.11 - samples/sec: 3378.85 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 20:35:52,434 epoch 1 - iter 81/272 - loss 1.86529559 - time (sec): 4.82 - samples/sec: 3488.89 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 20:35:54,050 epoch 1 - iter 108/272 - loss 1.55027338 - time (sec): 6.44 - samples/sec: 3469.83 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 20:35:55,620 epoch 1 - iter 135/272 - loss 1.34824134 - time (sec): 8.01 - samples/sec: 3376.95 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 20:35:57,159 epoch 1 - iter 162/272 - loss 1.19909600 - time (sec): 9.55 - samples/sec: 3324.44 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 20:35:58,661 epoch 1 - iter 189/272 - loss 1.07641544 - time (sec): 11.05 - samples/sec: 3283.61 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 20:36:00,159 epoch 1 - iter 216/272 - loss 0.96694657 - time (sec): 12.55 - samples/sec: 3309.85 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 20:36:01,754 epoch 1 - iter 243/272 - loss 0.88212128 - time (sec): 14.14 - samples/sec: 3298.55 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 20:36:03,303 epoch 1 - iter 270/272 - loss 0.80966439 - time (sec): 15.69 - samples/sec: 3297.60 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 20:36:03,409 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:03,409 EPOCH 1 done: loss 0.8068 - lr: 0.000049 |
|
2023-10-17 20:36:04,535 DEV : loss 0.1529892534017563 - f1-score (micro avg) 0.6643 |
|
2023-10-17 20:36:04,542 saving best model |
|
2023-10-17 20:36:04,971 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:06,469 epoch 2 - iter 27/272 - loss 0.14767352 - time (sec): 1.50 - samples/sec: 3115.19 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 20:36:08,018 epoch 2 - iter 54/272 - loss 0.16063742 - time (sec): 3.05 - samples/sec: 3038.42 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 20:36:09,566 epoch 2 - iter 81/272 - loss 0.16994613 - time (sec): 4.59 - samples/sec: 3214.45 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 20:36:11,221 epoch 2 - iter 108/272 - loss 0.15666853 - time (sec): 6.25 - samples/sec: 3282.23 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 20:36:12,614 epoch 2 - iter 135/272 - loss 0.14680066 - time (sec): 7.64 - samples/sec: 3244.19 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 20:36:14,288 epoch 2 - iter 162/272 - loss 0.14317612 - time (sec): 9.32 - samples/sec: 3254.11 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 20:36:15,864 epoch 2 - iter 189/272 - loss 0.15532115 - time (sec): 10.89 - samples/sec: 3265.86 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 20:36:17,327 epoch 2 - iter 216/272 - loss 0.15551244 - time (sec): 12.36 - samples/sec: 3243.85 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 20:36:19,025 epoch 2 - iter 243/272 - loss 0.15283392 - time (sec): 14.05 - samples/sec: 3285.13 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 20:36:20,720 epoch 2 - iter 270/272 - loss 0.14764598 - time (sec): 15.75 - samples/sec: 3292.44 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 20:36:20,809 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:20,810 EPOCH 2 done: loss 0.1474 - lr: 0.000045 |
|
2023-10-17 20:36:22,282 DEV : loss 0.14341691136360168 - f1-score (micro avg) 0.7414 |
|
2023-10-17 20:36:22,288 saving best model |
|
2023-10-17 20:36:22,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:24,347 epoch 3 - iter 27/272 - loss 0.08105871 - time (sec): 1.46 - samples/sec: 2973.91 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 20:36:25,969 epoch 3 - iter 54/272 - loss 0.07398035 - time (sec): 3.08 - samples/sec: 3278.31 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 20:36:27,528 epoch 3 - iter 81/272 - loss 0.07599135 - time (sec): 4.64 - samples/sec: 3255.74 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 20:36:29,090 epoch 3 - iter 108/272 - loss 0.09258214 - time (sec): 6.20 - samples/sec: 3274.26 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 20:36:30,658 epoch 3 - iter 135/272 - loss 0.08730281 - time (sec): 7.77 - samples/sec: 3302.81 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 20:36:32,300 epoch 3 - iter 162/272 - loss 0.08665892 - time (sec): 9.41 - samples/sec: 3324.83 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 20:36:33,791 epoch 3 - iter 189/272 - loss 0.08469191 - time (sec): 10.90 - samples/sec: 3307.62 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 20:36:35,435 epoch 3 - iter 216/272 - loss 0.08075937 - time (sec): 12.54 - samples/sec: 3294.46 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 20:36:37,082 epoch 3 - iter 243/272 - loss 0.07970563 - time (sec): 14.19 - samples/sec: 3318.68 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 20:36:38,592 epoch 3 - iter 270/272 - loss 0.08167519 - time (sec): 15.70 - samples/sec: 3295.80 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 20:36:38,675 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:38,676 EPOCH 3 done: loss 0.0816 - lr: 0.000039 |
|
2023-10-17 20:36:40,353 DEV : loss 0.13894791901111603 - f1-score (micro avg) 0.7753 |
|
2023-10-17 20:36:40,358 saving best model |
|
2023-10-17 20:36:40,820 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:42,475 epoch 4 - iter 27/272 - loss 0.05336989 - time (sec): 1.65 - samples/sec: 3022.32 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 20:36:44,165 epoch 4 - iter 54/272 - loss 0.04979389 - time (sec): 3.34 - samples/sec: 3059.63 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 20:36:45,743 epoch 4 - iter 81/272 - loss 0.04856889 - time (sec): 4.92 - samples/sec: 3077.59 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 20:36:47,403 epoch 4 - iter 108/272 - loss 0.04541840 - time (sec): 6.58 - samples/sec: 3158.11 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 20:36:49,054 epoch 4 - iter 135/272 - loss 0.04714909 - time (sec): 8.23 - samples/sec: 3228.35 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 20:36:50,595 epoch 4 - iter 162/272 - loss 0.04965981 - time (sec): 9.77 - samples/sec: 3216.01 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 20:36:52,202 epoch 4 - iter 189/272 - loss 0.04801220 - time (sec): 11.38 - samples/sec: 3204.18 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 20:36:53,706 epoch 4 - iter 216/272 - loss 0.05445866 - time (sec): 12.88 - samples/sec: 3206.04 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 20:36:55,245 epoch 4 - iter 243/272 - loss 0.05136711 - time (sec): 14.42 - samples/sec: 3239.02 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 20:36:56,844 epoch 4 - iter 270/272 - loss 0.05077823 - time (sec): 16.02 - samples/sec: 3228.42 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 20:36:56,937 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:36:56,938 EPOCH 4 done: loss 0.0508 - lr: 0.000033 |
|
2023-10-17 20:36:58,390 DEV : loss 0.13528288900852203 - f1-score (micro avg) 0.8147 |
|
2023-10-17 20:36:58,397 saving best model |
|
2023-10-17 20:36:58,871 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:00,484 epoch 5 - iter 27/272 - loss 0.03017379 - time (sec): 1.61 - samples/sec: 3858.30 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 20:37:02,069 epoch 5 - iter 54/272 - loss 0.02969324 - time (sec): 3.19 - samples/sec: 3559.62 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 20:37:03,580 epoch 5 - iter 81/272 - loss 0.02806640 - time (sec): 4.70 - samples/sec: 3429.18 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 20:37:05,219 epoch 5 - iter 108/272 - loss 0.02899612 - time (sec): 6.34 - samples/sec: 3469.88 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 20:37:06,844 epoch 5 - iter 135/272 - loss 0.03366359 - time (sec): 7.97 - samples/sec: 3427.13 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 20:37:08,284 epoch 5 - iter 162/272 - loss 0.03423545 - time (sec): 9.41 - samples/sec: 3410.15 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 20:37:09,816 epoch 5 - iter 189/272 - loss 0.03373938 - time (sec): 10.94 - samples/sec: 3363.86 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 20:37:11,333 epoch 5 - iter 216/272 - loss 0.03476061 - time (sec): 12.46 - samples/sec: 3356.83 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 20:37:12,963 epoch 5 - iter 243/272 - loss 0.03395368 - time (sec): 14.09 - samples/sec: 3363.43 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 20:37:14,464 epoch 5 - iter 270/272 - loss 0.04253436 - time (sec): 15.59 - samples/sec: 3328.97 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 20:37:14,546 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:14,546 EPOCH 5 done: loss 0.0425 - lr: 0.000028 |
|
2023-10-17 20:37:15,988 DEV : loss 0.14397269487380981 - f1-score (micro avg) 0.7832 |
|
2023-10-17 20:37:15,992 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:17,561 epoch 6 - iter 27/272 - loss 0.01969757 - time (sec): 1.57 - samples/sec: 3315.39 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 20:37:19,006 epoch 6 - iter 54/272 - loss 0.03014094 - time (sec): 3.01 - samples/sec: 3327.53 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 20:37:20,630 epoch 6 - iter 81/272 - loss 0.03118706 - time (sec): 4.64 - samples/sec: 3344.28 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 20:37:22,156 epoch 6 - iter 108/272 - loss 0.02628473 - time (sec): 6.16 - samples/sec: 3337.23 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 20:37:23,748 epoch 6 - iter 135/272 - loss 0.02879644 - time (sec): 7.75 - samples/sec: 3331.55 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 20:37:25,439 epoch 6 - iter 162/272 - loss 0.03251347 - time (sec): 9.45 - samples/sec: 3352.29 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 20:37:26,942 epoch 6 - iter 189/272 - loss 0.02996413 - time (sec): 10.95 - samples/sec: 3308.74 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 20:37:28,634 epoch 6 - iter 216/272 - loss 0.03083279 - time (sec): 12.64 - samples/sec: 3330.28 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 20:37:30,187 epoch 6 - iter 243/272 - loss 0.03011500 - time (sec): 14.19 - samples/sec: 3321.42 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 20:37:31,676 epoch 6 - iter 270/272 - loss 0.03042002 - time (sec): 15.68 - samples/sec: 3303.36 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 20:37:31,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:31,769 EPOCH 6 done: loss 0.0305 - lr: 0.000022 |
|
2023-10-17 20:37:33,267 DEV : loss 0.1813816875219345 - f1-score (micro avg) 0.8216 |
|
2023-10-17 20:37:33,272 saving best model |
|
2023-10-17 20:37:33,743 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:35,243 epoch 7 - iter 27/272 - loss 0.00311689 - time (sec): 1.50 - samples/sec: 3342.37 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 20:37:36,976 epoch 7 - iter 54/272 - loss 0.00902378 - time (sec): 3.23 - samples/sec: 3550.53 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 20:37:38,510 epoch 7 - iter 81/272 - loss 0.01197238 - time (sec): 4.76 - samples/sec: 3358.10 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 20:37:40,035 epoch 7 - iter 108/272 - loss 0.01209143 - time (sec): 6.29 - samples/sec: 3363.76 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 20:37:41,487 epoch 7 - iter 135/272 - loss 0.01218218 - time (sec): 7.74 - samples/sec: 3329.24 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 20:37:43,186 epoch 7 - iter 162/272 - loss 0.01413961 - time (sec): 9.44 - samples/sec: 3362.95 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 20:37:44,691 epoch 7 - iter 189/272 - loss 0.01710511 - time (sec): 10.94 - samples/sec: 3371.01 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 20:37:46,306 epoch 7 - iter 216/272 - loss 0.01837378 - time (sec): 12.56 - samples/sec: 3350.44 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 20:37:47,848 epoch 7 - iter 243/272 - loss 0.01850521 - time (sec): 14.10 - samples/sec: 3329.51 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 20:37:49,464 epoch 7 - iter 270/272 - loss 0.01763455 - time (sec): 15.72 - samples/sec: 3301.17 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 20:37:49,550 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:49,550 EPOCH 7 done: loss 0.0176 - lr: 0.000017 |
|
2023-10-17 20:37:51,187 DEV : loss 0.18215733766555786 - f1-score (micro avg) 0.8037 |
|
2023-10-17 20:37:51,192 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:37:52,690 epoch 8 - iter 27/272 - loss 0.01552463 - time (sec): 1.50 - samples/sec: 3263.58 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 20:37:54,196 epoch 8 - iter 54/272 - loss 0.00989122 - time (sec): 3.00 - samples/sec: 3301.49 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 20:37:55,746 epoch 8 - iter 81/272 - loss 0.01267319 - time (sec): 4.55 - samples/sec: 3221.60 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 20:37:57,330 epoch 8 - iter 108/272 - loss 0.01198370 - time (sec): 6.14 - samples/sec: 3221.88 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 20:37:58,982 epoch 8 - iter 135/272 - loss 0.01115762 - time (sec): 7.79 - samples/sec: 3243.91 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 20:38:00,732 epoch 8 - iter 162/272 - loss 0.01099956 - time (sec): 9.54 - samples/sec: 3279.51 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 20:38:02,228 epoch 8 - iter 189/272 - loss 0.01195079 - time (sec): 11.04 - samples/sec: 3271.08 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 20:38:03,729 epoch 8 - iter 216/272 - loss 0.01185353 - time (sec): 12.54 - samples/sec: 3303.36 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 20:38:05,278 epoch 8 - iter 243/272 - loss 0.01230926 - time (sec): 14.09 - samples/sec: 3304.17 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 20:38:06,891 epoch 8 - iter 270/272 - loss 0.01221482 - time (sec): 15.70 - samples/sec: 3304.09 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 20:38:06,982 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:38:06,982 EPOCH 8 done: loss 0.0122 - lr: 0.000011 |
|
2023-10-17 20:38:08,440 DEV : loss 0.1936648190021515 - f1-score (micro avg) 0.8052 |
|
2023-10-17 20:38:08,444 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:38:10,017 epoch 9 - iter 27/272 - loss 0.00449892 - time (sec): 1.57 - samples/sec: 3347.65 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 20:38:11,746 epoch 9 - iter 54/272 - loss 0.00909439 - time (sec): 3.30 - samples/sec: 3482.19 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 20:38:13,402 epoch 9 - iter 81/272 - loss 0.00656992 - time (sec): 4.96 - samples/sec: 3312.03 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 20:38:14,871 epoch 9 - iter 108/272 - loss 0.00738600 - time (sec): 6.43 - samples/sec: 3221.32 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 20:38:16,401 epoch 9 - iter 135/272 - loss 0.00889789 - time (sec): 7.96 - samples/sec: 3289.28 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 20:38:18,072 epoch 9 - iter 162/272 - loss 0.00867565 - time (sec): 9.63 - samples/sec: 3306.93 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 20:38:19,720 epoch 9 - iter 189/272 - loss 0.00832457 - time (sec): 11.27 - samples/sec: 3300.78 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 20:38:21,169 epoch 9 - iter 216/272 - loss 0.00858442 - time (sec): 12.72 - samples/sec: 3262.07 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 20:38:22,806 epoch 9 - iter 243/272 - loss 0.00797786 - time (sec): 14.36 - samples/sec: 3310.65 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 20:38:24,281 epoch 9 - iter 270/272 - loss 0.00877223 - time (sec): 15.84 - samples/sec: 3269.89 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 20:38:24,383 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:38:24,384 EPOCH 9 done: loss 0.0087 - lr: 0.000006 |
|
2023-10-17 20:38:25,848 DEV : loss 0.1992143839597702 - f1-score (micro avg) 0.8148 |
|
2023-10-17 20:38:25,853 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:38:27,426 epoch 10 - iter 27/272 - loss 0.00323059 - time (sec): 1.57 - samples/sec: 3549.69 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 20:38:28,948 epoch 10 - iter 54/272 - loss 0.00272895 - time (sec): 3.09 - samples/sec: 3407.43 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 20:38:30,546 epoch 10 - iter 81/272 - loss 0.00227762 - time (sec): 4.69 - samples/sec: 3301.03 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 20:38:32,023 epoch 10 - iter 108/272 - loss 0.00557187 - time (sec): 6.17 - samples/sec: 3189.77 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 20:38:33,595 epoch 10 - iter 135/272 - loss 0.00503865 - time (sec): 7.74 - samples/sec: 3199.64 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 20:38:35,205 epoch 10 - iter 162/272 - loss 0.00627907 - time (sec): 9.35 - samples/sec: 3222.14 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 20:38:36,745 epoch 10 - iter 189/272 - loss 0.00668464 - time (sec): 10.89 - samples/sec: 3228.32 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 20:38:38,410 epoch 10 - iter 216/272 - loss 0.00681670 - time (sec): 12.56 - samples/sec: 3263.15 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 20:38:40,150 epoch 10 - iter 243/272 - loss 0.00634062 - time (sec): 14.30 - samples/sec: 3289.49 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 20:38:41,537 epoch 10 - iter 270/272 - loss 0.00596329 - time (sec): 15.68 - samples/sec: 3296.25 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 20:38:41,648 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:38:41,648 EPOCH 10 done: loss 0.0059 - lr: 0.000000 |
|
2023-10-17 20:38:43,092 DEV : loss 0.19752152264118195 - f1-score (micro avg) 0.8148 |
|
2023-10-17 20:38:43,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:38:43,496 Loading model from best epoch ... |
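Note on checkpoint selection: the "saving best model" lines above appear after epochs 1, 2, 3, 4 and 6, which is what a strict-improvement rule on the dev micro-F1 would produce. The sketch below replays that rule on the logged dev scores. The rule itself is an assumption that happens to match this log, not a statement about Flair's internals.

```python
# Dev micro-F1 per epoch, copied from the DEV lines above.
dev_f1 = [0.6643, 0.7414, 0.7753, 0.8147, 0.7832,
          0.8216, 0.8037, 0.8052, 0.8148, 0.8148]

best_score = float("-inf")
saved_epochs = []  # epochs after which a checkpoint would be written

for epoch, score in enumerate(dev_f1, start=1):
    # Assumed rule: save only on a strict improvement over the best so far.
    if score > best_score:
        best_score = score
        saved_epochs.append(epoch)

best_epoch = saved_epochs[-1]
print(saved_epochs, best_epoch, best_score)
```

Under this rule the model loaded for the final evaluation is the epoch 6 checkpoint (dev micro-F1 0.8216), even though dev loss was lowest at epoch 4.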
|
2023-10-17 20:38:45,082 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
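Note on the tag dictionary: the 17 tags are the outside tag `O` plus four span-position prefixes (S, B, E, I) for each of the four entity types, which matches `out_features=17` on the final linear layer. A minimal sketch that rebuilds the list in the logged order; the variable names are illustrative only.

```python
# Entity types and prefixes in the order they appear in the logged dictionary.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" marks tokens outside any entity; every type gets all four prefixes.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```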
|
2023-10-17 20:38:47,291 |
|
Results: |
|
- F-score (micro) 0.7871 |
|
- F-score (macro) 0.7579 |
|
- Accuracy 0.664 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.8399    0.8237    0.8317       312 |
|
         PER     0.7027    0.8750    0.7794       208 |
|
         ORG     0.5179    0.5273    0.5225        55 |
|
   HumanProd     0.8148    1.0000    0.8980        22 |
|
|
|
   micro avg     0.7562    0.8208    0.7871       597 |
|
   macro avg     0.7188    0.8065    0.7579       597 |
|
weighted avg     0.7615    0.8208    0.7875       597 |
|
|
|
2023-10-17 20:38:47,291 ---------------------------------------------------------------------------------------------------- |
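Note on the averaged scores: the macro, weighted and micro F1 values in the report can be recomputed from the per-class rows. The sketch below does this with the rounded figures printed above, so results can differ from the report in the last decimal. The dictionary layout and variable names are illustrative assumptions.

```python
# Per-class rows from the report: (precision, recall, f1, support).
report = {
    "LOC":       (0.8399, 0.8237, 0.8317, 312),
    "PER":       (0.7027, 0.8750, 0.7794, 208),
    "ORG":       (0.5179, 0.5273, 0.5225, 55),
    "HumanProd": (0.8148, 1.0000, 0.8980, 22),
}

total_support = sum(row[3] for row in report.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted average: per-class F1 weighted by the number of gold spans.
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

# Micro F1: harmonic mean of the pooled precision and recall from the report.
micro_precision, micro_recall = 0.7562, 0.8208
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(total_support, round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

The gap between micro F1 (0.7871) and macro F1 (0.7579) comes mainly from ORG, which has the lowest F1 (0.5225) but only 55 of the 597 gold spans.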
|
|