|
2023-10-16 19:38:25,064 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,065 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 19:38:25,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,065 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-16 19:38:25,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,065 Train: 1085 sentences |
|
2023-10-16 19:38:25,065 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 19:38:25,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,065 Training Params: |
|
2023-10-16 19:38:25,066 - learning_rate: "5e-05" |
|
2023-10-16 19:38:25,066 - mini_batch_size: "8" |
|
2023-10-16 19:38:25,066 - max_epochs: "10" |
|
2023-10-16 19:38:25,066 - shuffle: "True" |
|
2023-10-16 19:38:25,066 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,066 Plugins: |
|
2023-10-16 19:38:25,066 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 19:38:25,066 ---------------------------------------------------------------------------------------------------- |
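Note on the schedule: the parameters above (1085 training sentences, mini_batch_size 8, max_epochs 10, learning_rate 5e-05, warmup_fraction 0.1) imply 136 iterations per epoch and a warmup that spans roughly the first epoch. The following is a minimal sketch of a linear warmup followed by linear decay under those numbers. The function name `linear_schedule_with_warmup` and the 0-based step convention are assumptions made for illustration; this is not taken from the trainer's source, and the exact step at which the trainer logs `lr` may differ by one.

```python
import math


def linear_schedule_with_warmup(step, total_steps, warmup_steps, peak_lr):
    """Learning rate after `step` optimizer updates (hypothetical helper).

    Rises linearly from 0 to peak_lr over warmup_steps, then falls
    linearly back to 0 at total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values copied from the log header.
train_sentences = 1085
mini_batch_size = 8
max_epochs = 10
peak_lr = 5e-05
warmup_fraction = 0.1

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)  # matches "iter .../136"
total_steps = iters_per_epoch * max_epochs
warmup_steps = round(total_steps * warmup_fraction)

if __name__ == "__main__":
    # Print the rate at each epoch boundary.
    for epoch in range(max_epochs + 1):
        step = epoch * iters_per_epoch
        lr = linear_schedule_with_warmup(step, total_steps, warmup_steps, peak_lr)
        print(f"after epoch {epoch:2d} (step {step:4d}): lr = {lr:.6f}")
```

Under these assumptions the rate peaks at the end of epoch 1 and reaches zero at the end of epoch 10, which is consistent with the `lr` column in the iteration lines of this log (0.000047 late in epoch 1, 0.000050 at the start of epoch 2, 0.000000 at the end of epoch 10).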
|
2023-10-16 19:38:25,066 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 19:38:25,066 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 19:38:25,066 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,066 Computation: |
|
2023-10-16 19:38:25,066 - compute on device: cuda:0 |
|
2023-10-16 19:38:25,066 - embedding storage: none |
|
2023-10-16 19:38:25,066 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,066 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-16 19:38:25,066 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:25,066 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:26,453 epoch 1 - iter 13/136 - loss 3.02415645 - time (sec): 1.39 - samples/sec: 3380.07 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 19:38:27,894 epoch 1 - iter 26/136 - loss 2.73651899 - time (sec): 2.83 - samples/sec: 3429.90 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:38:29,134 epoch 1 - iter 39/136 - loss 2.19292351 - time (sec): 4.07 - samples/sec: 3528.44 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 19:38:30,344 epoch 1 - iter 52/136 - loss 1.83877465 - time (sec): 5.28 - samples/sec: 3573.73 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 19:38:31,580 epoch 1 - iter 65/136 - loss 1.56468537 - time (sec): 6.51 - samples/sec: 3685.59 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:38:32,997 epoch 1 - iter 78/136 - loss 1.37162883 - time (sec): 7.93 - samples/sec: 3708.76 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 19:38:34,370 epoch 1 - iter 91/136 - loss 1.23231369 - time (sec): 9.30 - samples/sec: 3698.96 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 19:38:35,609 epoch 1 - iter 104/136 - loss 1.12236078 - time (sec): 10.54 - samples/sec: 3723.21 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 19:38:37,007 epoch 1 - iter 117/136 - loss 1.02932350 - time (sec): 11.94 - samples/sec: 3727.18 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 19:38:38,585 epoch 1 - iter 130/136 - loss 0.94617700 - time (sec): 13.52 - samples/sec: 3675.25 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 19:38:39,187 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:39,187 EPOCH 1 done: loss 0.9161 - lr: 0.000047 |
|
2023-10-16 19:38:40,204 DEV : loss 0.20288820564746857 - f1-score (micro avg) 0.4722 |
|
2023-10-16 19:38:40,208 saving best model |
|
2023-10-16 19:38:40,522 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:42,045 epoch 2 - iter 13/136 - loss 0.23732824 - time (sec): 1.52 - samples/sec: 3734.23 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-16 19:38:43,375 epoch 2 - iter 26/136 - loss 0.20696770 - time (sec): 2.85 - samples/sec: 3603.59 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 19:38:44,679 epoch 2 - iter 39/136 - loss 0.19951300 - time (sec): 4.16 - samples/sec: 3714.36 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 19:38:46,063 epoch 2 - iter 52/136 - loss 0.21957384 - time (sec): 5.54 - samples/sec: 3616.90 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 19:38:47,354 epoch 2 - iter 65/136 - loss 0.21017221 - time (sec): 6.83 - samples/sec: 3618.48 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 19:38:48,776 epoch 2 - iter 78/136 - loss 0.19876748 - time (sec): 8.25 - samples/sec: 3574.30 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 19:38:50,182 epoch 2 - iter 91/136 - loss 0.19154341 - time (sec): 9.66 - samples/sec: 3605.84 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 19:38:51,554 epoch 2 - iter 104/136 - loss 0.18885502 - time (sec): 11.03 - samples/sec: 3630.47 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 19:38:52,936 epoch 2 - iter 117/136 - loss 0.18113803 - time (sec): 12.41 - samples/sec: 3630.23 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 19:38:54,386 epoch 2 - iter 130/136 - loss 0.17639998 - time (sec): 13.86 - samples/sec: 3592.81 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 19:38:55,007 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:55,007 EPOCH 2 done: loss 0.1744 - lr: 0.000045 |
|
2023-10-16 19:38:56,455 DEV : loss 0.12787802517414093 - f1-score (micro avg) 0.709 |
|
2023-10-16 19:38:56,462 saving best model |
|
2023-10-16 19:38:56,980 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:38:58,375 epoch 3 - iter 13/136 - loss 0.10130091 - time (sec): 1.39 - samples/sec: 3458.50 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 19:38:59,508 epoch 3 - iter 26/136 - loss 0.09604338 - time (sec): 2.52 - samples/sec: 3682.54 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 19:39:00,997 epoch 3 - iter 39/136 - loss 0.10268479 - time (sec): 4.01 - samples/sec: 3713.63 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 19:39:02,517 epoch 3 - iter 52/136 - loss 0.10309567 - time (sec): 5.53 - samples/sec: 3542.78 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 19:39:03,822 epoch 3 - iter 65/136 - loss 0.09646645 - time (sec): 6.84 - samples/sec: 3534.59 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 19:39:05,342 epoch 3 - iter 78/136 - loss 0.09885337 - time (sec): 8.36 - samples/sec: 3539.54 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 19:39:06,891 epoch 3 - iter 91/136 - loss 0.09499064 - time (sec): 9.91 - samples/sec: 3475.54 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 19:39:08,202 epoch 3 - iter 104/136 - loss 0.09164222 - time (sec): 11.22 - samples/sec: 3466.50 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 19:39:09,616 epoch 3 - iter 117/136 - loss 0.09539987 - time (sec): 12.63 - samples/sec: 3460.71 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 19:39:11,090 epoch 3 - iter 130/136 - loss 0.09368696 - time (sec): 14.11 - samples/sec: 3501.20 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 19:39:11,799 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:39:11,799 EPOCH 3 done: loss 0.0936 - lr: 0.000039 |
|
2023-10-16 19:39:13,630 DEV : loss 0.10517842322587967 - f1-score (micro avg) 0.7648 |
|
2023-10-16 19:39:13,634 saving best model |
|
2023-10-16 19:39:14,303 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:39:15,796 epoch 4 - iter 13/136 - loss 0.06608433 - time (sec): 1.49 - samples/sec: 3677.44 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 19:39:17,085 epoch 4 - iter 26/136 - loss 0.05710389 - time (sec): 2.78 - samples/sec: 3836.83 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 19:39:18,540 epoch 4 - iter 39/136 - loss 0.05467172 - time (sec): 4.23 - samples/sec: 3719.82 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 19:39:20,049 epoch 4 - iter 52/136 - loss 0.05580848 - time (sec): 5.74 - samples/sec: 3608.35 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 19:39:21,509 epoch 4 - iter 65/136 - loss 0.05256870 - time (sec): 7.20 - samples/sec: 3558.38 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 19:39:22,925 epoch 4 - iter 78/136 - loss 0.05445472 - time (sec): 8.62 - samples/sec: 3545.68 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 19:39:24,676 epoch 4 - iter 91/136 - loss 0.05306817 - time (sec): 10.37 - samples/sec: 3511.57 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 19:39:25,972 epoch 4 - iter 104/136 - loss 0.05204892 - time (sec): 11.66 - samples/sec: 3509.49 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 19:39:27,239 epoch 4 - iter 117/136 - loss 0.04955030 - time (sec): 12.93 - samples/sec: 3519.63 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 19:39:28,688 epoch 4 - iter 130/136 - loss 0.05090082 - time (sec): 14.38 - samples/sec: 3482.74 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 19:39:29,277 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:39:29,277 EPOCH 4 done: loss 0.0503 - lr: 0.000034 |
|
2023-10-16 19:39:30,742 DEV : loss 0.11859514564275742 - f1-score (micro avg) 0.7751 |
|
2023-10-16 19:39:30,746 saving best model |
|
2023-10-16 19:39:31,279 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:39:32,725 epoch 5 - iter 13/136 - loss 0.05288570 - time (sec): 1.44 - samples/sec: 3341.66 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 19:39:34,235 epoch 5 - iter 26/136 - loss 0.03888893 - time (sec): 2.95 - samples/sec: 3360.97 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 19:39:35,703 epoch 5 - iter 39/136 - loss 0.03461814 - time (sec): 4.42 - samples/sec: 3491.55 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 19:39:37,270 epoch 5 - iter 52/136 - loss 0.03465037 - time (sec): 5.99 - samples/sec: 3451.16 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 19:39:38,489 epoch 5 - iter 65/136 - loss 0.03740903 - time (sec): 7.21 - samples/sec: 3564.80 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 19:39:39,872 epoch 5 - iter 78/136 - loss 0.03524836 - time (sec): 8.59 - samples/sec: 3525.38 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 19:39:41,208 epoch 5 - iter 91/136 - loss 0.03585208 - time (sec): 9.92 - samples/sec: 3537.93 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 19:39:42,760 epoch 5 - iter 104/136 - loss 0.03429363 - time (sec): 11.48 - samples/sec: 3529.25 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 19:39:44,131 epoch 5 - iter 117/136 - loss 0.03278488 - time (sec): 12.85 - samples/sec: 3543.05 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 19:39:45,521 epoch 5 - iter 130/136 - loss 0.03204268 - time (sec): 14.24 - samples/sec: 3549.25 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 19:39:45,959 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:39:45,959 EPOCH 5 done: loss 0.0324 - lr: 0.000028 |
|
2023-10-16 19:39:47,752 DEV : loss 0.12475510686635971 - f1-score (micro avg) 0.8214 |
|
2023-10-16 19:39:47,756 saving best model |
|
2023-10-16 19:39:48,262 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:39:49,651 epoch 6 - iter 13/136 - loss 0.01579763 - time (sec): 1.39 - samples/sec: 3361.70 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 19:39:50,851 epoch 6 - iter 26/136 - loss 0.02031476 - time (sec): 2.59 - samples/sec: 3594.29 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 19:39:52,223 epoch 6 - iter 39/136 - loss 0.02508221 - time (sec): 3.96 - samples/sec: 3588.18 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 19:39:53,795 epoch 6 - iter 52/136 - loss 0.02385107 - time (sec): 5.53 - samples/sec: 3554.23 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 19:39:55,038 epoch 6 - iter 65/136 - loss 0.02550333 - time (sec): 6.77 - samples/sec: 3691.72 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 19:39:56,455 epoch 6 - iter 78/136 - loss 0.02488529 - time (sec): 8.19 - samples/sec: 3573.47 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 19:39:57,884 epoch 6 - iter 91/136 - loss 0.02306815 - time (sec): 9.62 - samples/sec: 3522.48 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:39:59,503 epoch 6 - iter 104/136 - loss 0.02319266 - time (sec): 11.24 - samples/sec: 3504.24 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:40:00,962 epoch 6 - iter 117/136 - loss 0.02205655 - time (sec): 12.70 - samples/sec: 3499.80 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 19:40:02,319 epoch 6 - iter 130/136 - loss 0.02214229 - time (sec): 14.06 - samples/sec: 3534.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 19:40:02,991 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:02,992 EPOCH 6 done: loss 0.0226 - lr: 0.000023 |
|
2023-10-16 19:40:04,414 DEV : loss 0.1283072680234909 - f1-score (micro avg) 0.8 |
|
2023-10-16 19:40:04,418 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:05,810 epoch 7 - iter 13/136 - loss 0.02517646 - time (sec): 1.39 - samples/sec: 4173.17 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 19:40:07,200 epoch 7 - iter 26/136 - loss 0.01976096 - time (sec): 2.78 - samples/sec: 3751.25 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 19:40:08,558 epoch 7 - iter 39/136 - loss 0.01871697 - time (sec): 4.14 - samples/sec: 3810.94 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 19:40:09,860 epoch 7 - iter 52/136 - loss 0.01928305 - time (sec): 5.44 - samples/sec: 3807.28 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 19:40:11,172 epoch 7 - iter 65/136 - loss 0.01756808 - time (sec): 6.75 - samples/sec: 3750.80 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 19:40:12,582 epoch 7 - iter 78/136 - loss 0.01776526 - time (sec): 8.16 - samples/sec: 3725.04 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 19:40:14,129 epoch 7 - iter 91/136 - loss 0.01667503 - time (sec): 9.71 - samples/sec: 3720.31 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 19:40:15,364 epoch 7 - iter 104/136 - loss 0.01663346 - time (sec): 10.94 - samples/sec: 3705.00 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 19:40:16,770 epoch 7 - iter 117/136 - loss 0.01670474 - time (sec): 12.35 - samples/sec: 3681.72 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 19:40:18,168 epoch 7 - iter 130/136 - loss 0.01720601 - time (sec): 13.75 - samples/sec: 3642.58 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 19:40:18,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:18,768 EPOCH 7 done: loss 0.0174 - lr: 0.000017 |
|
2023-10-16 19:40:20,403 DEV : loss 0.14328120648860931 - f1-score (micro avg) 0.7899 |
|
2023-10-16 19:40:20,407 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:21,913 epoch 8 - iter 13/136 - loss 0.00578552 - time (sec): 1.50 - samples/sec: 3549.97 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 19:40:23,532 epoch 8 - iter 26/136 - loss 0.00802889 - time (sec): 3.12 - samples/sec: 3436.77 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 19:40:24,845 epoch 8 - iter 39/136 - loss 0.00991238 - time (sec): 4.44 - samples/sec: 3553.05 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 19:40:26,046 epoch 8 - iter 52/136 - loss 0.00922150 - time (sec): 5.64 - samples/sec: 3555.06 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 19:40:27,545 epoch 8 - iter 65/136 - loss 0.01032178 - time (sec): 7.14 - samples/sec: 3626.78 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 19:40:28,943 epoch 8 - iter 78/136 - loss 0.01028200 - time (sec): 8.54 - samples/sec: 3589.22 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 19:40:30,503 epoch 8 - iter 91/136 - loss 0.01074870 - time (sec): 10.09 - samples/sec: 3592.90 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 19:40:32,046 epoch 8 - iter 104/136 - loss 0.00989424 - time (sec): 11.64 - samples/sec: 3558.63 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 19:40:33,390 epoch 8 - iter 117/136 - loss 0.01118879 - time (sec): 12.98 - samples/sec: 3511.74 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 19:40:34,696 epoch 8 - iter 130/136 - loss 0.01176432 - time (sec): 14.29 - samples/sec: 3480.28 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 19:40:35,376 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:35,376 EPOCH 8 done: loss 0.0117 - lr: 0.000012 |
|
2023-10-16 19:40:36,803 DEV : loss 0.15624405443668365 - f1-score (micro avg) 0.8133 |
|
2023-10-16 19:40:36,807 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:38,141 epoch 9 - iter 13/136 - loss 0.00928382 - time (sec): 1.33 - samples/sec: 3710.24 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 19:40:39,597 epoch 9 - iter 26/136 - loss 0.01339997 - time (sec): 2.79 - samples/sec: 3706.34 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 19:40:41,087 epoch 9 - iter 39/136 - loss 0.01199908 - time (sec): 4.28 - samples/sec: 3708.38 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 19:40:42,388 epoch 9 - iter 52/136 - loss 0.01266300 - time (sec): 5.58 - samples/sec: 3762.51 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:40:43,771 epoch 9 - iter 65/136 - loss 0.01242176 - time (sec): 6.96 - samples/sec: 3686.38 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:40:45,109 epoch 9 - iter 78/136 - loss 0.01236743 - time (sec): 8.30 - samples/sec: 3581.29 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 19:40:46,647 epoch 9 - iter 91/136 - loss 0.01228619 - time (sec): 9.84 - samples/sec: 3554.07 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 19:40:47,911 epoch 9 - iter 104/136 - loss 0.01090571 - time (sec): 11.10 - samples/sec: 3580.06 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 19:40:49,656 epoch 9 - iter 117/136 - loss 0.01047892 - time (sec): 12.85 - samples/sec: 3517.41 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 19:40:51,014 epoch 9 - iter 130/136 - loss 0.00993445 - time (sec): 14.21 - samples/sec: 3532.53 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 19:40:51,564 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:51,564 EPOCH 9 done: loss 0.0099 - lr: 0.000006 |
|
2023-10-16 19:40:53,004 DEV : loss 0.15900768339633942 - f1-score (micro avg) 0.8088 |
|
2023-10-16 19:40:53,008 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:40:55,022 epoch 10 - iter 13/136 - loss 0.00709288 - time (sec): 2.01 - samples/sec: 2608.60 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 19:40:56,538 epoch 10 - iter 26/136 - loss 0.00518710 - time (sec): 3.53 - samples/sec: 2992.37 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 19:40:57,869 epoch 10 - iter 39/136 - loss 0.00781429 - time (sec): 4.86 - samples/sec: 3046.30 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 19:40:59,220 epoch 10 - iter 52/136 - loss 0.00899035 - time (sec): 6.21 - samples/sec: 3172.27 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 19:41:00,601 epoch 10 - iter 65/136 - loss 0.00817988 - time (sec): 7.59 - samples/sec: 3212.55 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 19:41:02,099 epoch 10 - iter 78/136 - loss 0.00777940 - time (sec): 9.09 - samples/sec: 3226.67 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 19:41:03,454 epoch 10 - iter 91/136 - loss 0.00719324 - time (sec): 10.44 - samples/sec: 3245.39 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 19:41:04,870 epoch 10 - iter 104/136 - loss 0.00693415 - time (sec): 11.86 - samples/sec: 3312.65 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 19:41:06,420 epoch 10 - iter 117/136 - loss 0.00767387 - time (sec): 13.41 - samples/sec: 3353.14 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 19:41:07,740 epoch 10 - iter 130/136 - loss 0.00777585 - time (sec): 14.73 - samples/sec: 3398.50 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 19:41:08,252 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:08,252 EPOCH 10 done: loss 0.0076 - lr: 0.000000 |
|
2023-10-16 19:41:09,680 DEV : loss 0.16889716684818268 - f1-score (micro avg) 0.8103 |
|
2023-10-16 19:41:10,080 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:10,081 Loading model from best epoch ... |
|
2023-10-16 19:41:11,797 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
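Note on the tag dictionary: the 17 tags listed above are the BIOES prefixes (S, B, E, I) applied to the four entity types, plus `O`, which matches the `out_features=17` of the final linear layer. A minimal sketch that rebuilds this inventory and decodes a tag sequence into spans is shown below. The helper `bioes_to_spans` is hypothetical and deliberately strict (it drops malformed sequences); the library's own decoding may handle such cases differently.

```python
# Entity types and prefixes in the order they appear in the logged dictionary.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]
prefixes = ["S", "B", "E", "I"]

# "O" first, then the four prefixed tags for each entity type.
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]


def bioes_to_spans(labels):
    """Convert BIOES labels to (start, end_exclusive, type) spans.

    Hypothetical strict decoder: an I- or E- tag without a matching
    open B- tag of the same type is ignored.
    """
    spans = []
    open_span = None  # (start_index, entity_type) of a span begun by B-
    for i, label in enumerate(labels):
        if label == "O":
            open_span = None
            continue
        prefix, etype = label.split("-", 1)
        if prefix == "S":
            spans.append((i, i + 1, etype))
            open_span = None
        elif prefix == "B":
            open_span = (i, etype)
        elif prefix == "E" and open_span is not None and open_span[1] == etype:
            spans.append((open_span[0], i + 1, etype))
            open_span = None
        elif prefix == "I" and open_span is not None and open_span[1] == etype:
            continue  # still inside the open span
        else:
            open_span = None  # malformed sequence, discard
    return spans


if __name__ == "__main__":
    print(len(tags), tags)
    print(bioes_to_spans(["O", "B-PER", "E-PER", "S-LOC"]))
```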
|
2023-10-16 19:41:13,818 |
|
Results: |
|
- F-score (micro) 0.7764 |
|
- F-score (macro) 0.7367 |
|
- Accuracy 0.651 |
|
|
|
By class:
              precision    recall  f1-score   support

         LOC     0.7971    0.8942    0.8429       312
         PER     0.6439    0.8606    0.7366       208
         ORG     0.5366    0.4000    0.4583        55
   HumanProd     0.9091    0.9091    0.9091        22

   micro avg     0.7236    0.8375    0.7764       597
   macro avg     0.7217    0.7660    0.7367       597
weighted avg     0.7239    0.8375    0.7729       597
|
|
|
2023-10-16 19:41:13,818 ---------------------------------------------------------------------------------------------------- |
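Note on the averages: the summary rows of the per-class report can be re-derived from the per-class rows. The sketch below copies the rounded figures from the report and recomputes the macro, weighted, and micro F1 values. Because the inputs are rounded to four decimals, the results are only expected to agree with the logged values to about three decimals. The variable names are illustrative, and the micro F1 here is computed from the logged micro precision and recall rather than from raw counts, which the log does not contain.

```python
# (precision, recall, f1, support) per class, copied from the report.
per_class = {
    "LOC": (0.7971, 0.8942, 0.8429, 312),
    "PER": (0.6439, 0.8606, 0.7366, 208),
    "ORG": (0.5366, 0.4000, 0.4583, 55),
    "HumanProd": (0.9091, 0.9091, 0.9091, 22),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(s for _, _, _, s in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro F1 from the logged micro-averaged precision and recall.
micro_f1 = f1_score(0.7236, 0.8375)

if __name__ == "__main__":
    print(f"support     {total_support}")
    print(f"macro F1    {macro_f1:.4f}")
    print(f"weighted F1 {weighted_f1:.4f}")
    print(f"micro F1    {micro_f1:.4f}")
```

The gap between micro (0.7764) and macro (0.7367) F1 comes mostly from ORG, which has low F1 but only 55 of the 597 gold spans, so it pulls the unweighted macro mean down more than the micro figure.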
|
|