|
2023-10-17 19:51:52,385 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,386 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 19:51:52,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,386 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-17 19:51:52,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,386 Train: 1085 sentences |
|
2023-10-17 19:51:52,386 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 19:51:52,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,386 Training Params: |
|
2023-10-17 19:51:52,386 - learning_rate: "3e-05" |
|
2023-10-17 19:51:52,386 - mini_batch_size: "4" |
|
2023-10-17 19:51:52,386 - max_epochs: "10" |
|
2023-10-17 19:51:52,386 - shuffle: "True" |
|
2023-10-17 19:51:52,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,386 Plugins: |
|
2023-10-17 19:51:52,386 - TensorboardLogger |
|
2023-10-17 19:51:52,386 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 19:51:52,386 ---------------------------------------------------------------------------------------------------- |
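The `lr` column in the per-iteration lines below can be reproduced from the training parameters above. The following is a minimal sketch, assuming the scheduler warms up linearly over the first 10% of all optimizer steps and then decays linearly to zero; the function names `linear_lr` and `global_step` are illustrative and are not part of the logging library.

```python
import math

# Values taken from the "Train" and "Training Params" sections of the log.
TRAIN_SENTENCES = 1085
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

# 1085 sentences in batches of 4 give 272 iterations per epoch ("iter .../272").
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS
warmup_steps = round(total_steps * WARMUP_FRACTION)


def linear_lr(step):
    """Learning rate at a global step (assumed schedule: linear warmup, linear decay)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * (total_steps - step) / (total_steps - warmup_steps)


def global_step(epoch, iteration):
    """Convert a 1-based epoch and in-epoch iteration into a global step."""
    return (epoch - 1) * steps_per_epoch + iteration


if __name__ == "__main__":
    for epoch, iteration in [(1, 27), (1, 270), (2, 54), (5, 27), (10, 243)]:
        lr = linear_lr(global_step(epoch, iteration))
        print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {lr:.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1 (272 of 2720 steps), which matches the logged learning rate peaking at 0.000030 there and falling afterwards.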
|
2023-10-17 19:51:52,386 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 19:51:52,386 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 19:51:52,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,386 Computation: |
|
2023-10-17 19:51:52,386 - compute on device: cuda:0 |
|
2023-10-17 19:51:52,386 - embedding storage: none |
|
2023-10-17 19:51:52,387 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,387 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-17 19:51:52,387 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,387 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:51:52,387 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 19:51:53,991 epoch 1 - iter 27/272 - loss 3.68614455 - time (sec): 1.60 - samples/sec: 3013.94 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 19:51:55,492 epoch 1 - iter 54/272 - loss 3.27911665 - time (sec): 3.10 - samples/sec: 2936.58 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 19:51:57,102 epoch 1 - iter 81/272 - loss 2.42782165 - time (sec): 4.71 - samples/sec: 3172.39 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 19:51:58,619 epoch 1 - iter 108/272 - loss 1.93547446 - time (sec): 6.23 - samples/sec: 3222.39 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 19:52:00,174 epoch 1 - iter 135/272 - loss 1.66887966 - time (sec): 7.79 - samples/sec: 3167.82 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 19:52:01,759 epoch 1 - iter 162/272 - loss 1.43150079 - time (sec): 9.37 - samples/sec: 3235.06 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 19:52:03,295 epoch 1 - iter 189/272 - loss 1.27708928 - time (sec): 10.91 - samples/sec: 3248.58 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 19:52:05,064 epoch 1 - iter 216/272 - loss 1.11895645 - time (sec): 12.68 - samples/sec: 3302.85 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 19:52:06,572 epoch 1 - iter 243/272 - loss 1.03839737 - time (sec): 14.18 - samples/sec: 3295.24 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 19:52:08,173 epoch 1 - iter 270/272 - loss 0.95699653 - time (sec): 15.78 - samples/sec: 3280.80 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 19:52:08,284 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:52:08,285 EPOCH 1 done: loss 0.9543 - lr: 0.000030 |
|
2023-10-17 19:52:09,325 DEV : loss 0.17717474699020386 - f1-score (micro avg) 0.5914 |
|
2023-10-17 19:52:09,329 saving best model |
|
2023-10-17 19:52:09,689 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:52:11,253 epoch 2 - iter 27/272 - loss 0.16203503 - time (sec): 1.56 - samples/sec: 3304.40 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 19:52:12,809 epoch 2 - iter 54/272 - loss 0.17472666 - time (sec): 3.12 - samples/sec: 3375.93 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 19:52:14,492 epoch 2 - iter 81/272 - loss 0.18110744 - time (sec): 4.80 - samples/sec: 3347.54 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 19:52:16,126 epoch 2 - iter 108/272 - loss 0.17792809 - time (sec): 6.44 - samples/sec: 3312.91 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 19:52:17,559 epoch 2 - iter 135/272 - loss 0.17288065 - time (sec): 7.87 - samples/sec: 3279.42 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 19:52:19,203 epoch 2 - iter 162/272 - loss 0.18156776 - time (sec): 9.51 - samples/sec: 3278.49 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 19:52:20,692 epoch 2 - iter 189/272 - loss 0.17360395 - time (sec): 11.00 - samples/sec: 3239.67 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 19:52:22,312 epoch 2 - iter 216/272 - loss 0.16404397 - time (sec): 12.62 - samples/sec: 3275.19 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 19:52:23,951 epoch 2 - iter 243/272 - loss 0.15732808 - time (sec): 14.26 - samples/sec: 3265.49 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 19:52:25,422 epoch 2 - iter 270/272 - loss 0.15548570 - time (sec): 15.73 - samples/sec: 3287.26 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 19:52:25,512 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:52:25,513 EPOCH 2 done: loss 0.1553 - lr: 0.000027 |
|
2023-10-17 19:52:26,935 DEV : loss 0.1165740042924881 - f1-score (micro avg) 0.7687 |
|
2023-10-17 19:52:26,939 saving best model |
|
2023-10-17 19:52:27,426 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:52:29,141 epoch 3 - iter 27/272 - loss 0.07582860 - time (sec): 1.71 - samples/sec: 3187.78 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 19:52:30,790 epoch 3 - iter 54/272 - loss 0.08158520 - time (sec): 3.36 - samples/sec: 3334.75 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 19:52:32,345 epoch 3 - iter 81/272 - loss 0.08034283 - time (sec): 4.92 - samples/sec: 3356.99 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 19:52:33,932 epoch 3 - iter 108/272 - loss 0.08966396 - time (sec): 6.50 - samples/sec: 3372.91 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 19:52:35,546 epoch 3 - iter 135/272 - loss 0.08284123 - time (sec): 8.12 - samples/sec: 3357.93 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 19:52:37,046 epoch 3 - iter 162/272 - loss 0.08763320 - time (sec): 9.62 - samples/sec: 3337.11 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 19:52:38,616 epoch 3 - iter 189/272 - loss 0.08674857 - time (sec): 11.19 - samples/sec: 3323.91 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 19:52:40,057 epoch 3 - iter 216/272 - loss 0.08622983 - time (sec): 12.63 - samples/sec: 3291.27 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 19:52:41,676 epoch 3 - iter 243/272 - loss 0.08383335 - time (sec): 14.25 - samples/sec: 3304.56 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 19:52:43,140 epoch 3 - iter 270/272 - loss 0.08380410 - time (sec): 15.71 - samples/sec: 3298.30 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 19:52:43,229 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:52:43,229 EPOCH 3 done: loss 0.0836 - lr: 0.000023 |
|
2023-10-17 19:52:44,666 DEV : loss 0.11597966402769089 - f1-score (micro avg) 0.7549 |
|
2023-10-17 19:52:44,671 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:52:46,238 epoch 4 - iter 27/272 - loss 0.03817143 - time (sec): 1.57 - samples/sec: 3114.22 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 19:52:47,801 epoch 4 - iter 54/272 - loss 0.03346733 - time (sec): 3.13 - samples/sec: 3175.76 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 19:52:49,482 epoch 4 - iter 81/272 - loss 0.04517407 - time (sec): 4.81 - samples/sec: 3294.02 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 19:52:50,892 epoch 4 - iter 108/272 - loss 0.04319358 - time (sec): 6.22 - samples/sec: 3243.83 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 19:52:52,431 epoch 4 - iter 135/272 - loss 0.04534533 - time (sec): 7.76 - samples/sec: 3249.82 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 19:52:54,112 epoch 4 - iter 162/272 - loss 0.04889301 - time (sec): 9.44 - samples/sec: 3285.66 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 19:52:55,709 epoch 4 - iter 189/272 - loss 0.05064160 - time (sec): 11.04 - samples/sec: 3265.73 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 19:52:57,341 epoch 4 - iter 216/272 - loss 0.05215477 - time (sec): 12.67 - samples/sec: 3256.61 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 19:52:58,816 epoch 4 - iter 243/272 - loss 0.05121743 - time (sec): 14.14 - samples/sec: 3280.18 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 19:53:00,346 epoch 4 - iter 270/272 - loss 0.05390532 - time (sec): 15.67 - samples/sec: 3301.63 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 19:53:00,433 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:00,434 EPOCH 4 done: loss 0.0540 - lr: 0.000020 |
|
2023-10-17 19:53:01,866 DEV : loss 0.1187073215842247 - f1-score (micro avg) 0.7993 |
|
2023-10-17 19:53:01,870 saving best model |
|
2023-10-17 19:53:02,345 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:03,909 epoch 5 - iter 27/272 - loss 0.04055481 - time (sec): 1.56 - samples/sec: 3484.10 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 19:53:05,449 epoch 5 - iter 54/272 - loss 0.04184600 - time (sec): 3.10 - samples/sec: 3498.71 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 19:53:06,981 epoch 5 - iter 81/272 - loss 0.03728394 - time (sec): 4.63 - samples/sec: 3467.31 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 19:53:08,584 epoch 5 - iter 108/272 - loss 0.03770208 - time (sec): 6.23 - samples/sec: 3376.47 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 19:53:10,292 epoch 5 - iter 135/272 - loss 0.03610795 - time (sec): 7.94 - samples/sec: 3291.87 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 19:53:11,877 epoch 5 - iter 162/272 - loss 0.03311822 - time (sec): 9.53 - samples/sec: 3282.79 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 19:53:13,423 epoch 5 - iter 189/272 - loss 0.03480952 - time (sec): 11.07 - samples/sec: 3272.28 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 19:53:15,042 epoch 5 - iter 216/272 - loss 0.03330190 - time (sec): 12.69 - samples/sec: 3281.77 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 19:53:16,639 epoch 5 - iter 243/272 - loss 0.03351532 - time (sec): 14.29 - samples/sec: 3239.63 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 19:53:18,245 epoch 5 - iter 270/272 - loss 0.03210995 - time (sec): 15.89 - samples/sec: 3262.48 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 19:53:18,326 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:18,326 EPOCH 5 done: loss 0.0321 - lr: 0.000017 |
|
2023-10-17 19:53:19,962 DEV : loss 0.1374482363462448 - f1-score (micro avg) 0.8007 |
|
2023-10-17 19:53:19,967 saving best model |
|
2023-10-17 19:53:20,434 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:21,874 epoch 6 - iter 27/272 - loss 0.01735343 - time (sec): 1.44 - samples/sec: 3253.81 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 19:53:23,513 epoch 6 - iter 54/272 - loss 0.03248622 - time (sec): 3.08 - samples/sec: 3203.61 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 19:53:25,105 epoch 6 - iter 81/272 - loss 0.02783744 - time (sec): 4.67 - samples/sec: 3253.09 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 19:53:26,637 epoch 6 - iter 108/272 - loss 0.02635058 - time (sec): 6.20 - samples/sec: 3270.39 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 19:53:28,205 epoch 6 - iter 135/272 - loss 0.02425374 - time (sec): 7.77 - samples/sec: 3335.38 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 19:53:29,847 epoch 6 - iter 162/272 - loss 0.02336843 - time (sec): 9.41 - samples/sec: 3393.10 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 19:53:31,441 epoch 6 - iter 189/272 - loss 0.02378891 - time (sec): 11.00 - samples/sec: 3362.54 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 19:53:32,990 epoch 6 - iter 216/272 - loss 0.02378025 - time (sec): 12.55 - samples/sec: 3323.37 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 19:53:34,525 epoch 6 - iter 243/272 - loss 0.02280147 - time (sec): 14.09 - samples/sec: 3307.37 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 19:53:36,095 epoch 6 - iter 270/272 - loss 0.02523429 - time (sec): 15.66 - samples/sec: 3308.99 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 19:53:36,194 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:36,194 EPOCH 6 done: loss 0.0252 - lr: 0.000013 |
|
2023-10-17 19:53:37,620 DEV : loss 0.15510904788970947 - f1-score (micro avg) 0.8015 |
|
2023-10-17 19:53:37,625 saving best model |
|
2023-10-17 19:53:38,100 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:39,945 epoch 7 - iter 27/272 - loss 0.02080006 - time (sec): 1.84 - samples/sec: 3386.14 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 19:53:41,440 epoch 7 - iter 54/272 - loss 0.02050320 - time (sec): 3.34 - samples/sec: 3404.08 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 19:53:42,852 epoch 7 - iter 81/272 - loss 0.01895013 - time (sec): 4.75 - samples/sec: 3325.27 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 19:53:44,336 epoch 7 - iter 108/272 - loss 0.01855506 - time (sec): 6.23 - samples/sec: 3236.28 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 19:53:45,911 epoch 7 - iter 135/272 - loss 0.01638277 - time (sec): 7.81 - samples/sec: 3220.02 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 19:53:47,494 epoch 7 - iter 162/272 - loss 0.01646283 - time (sec): 9.39 - samples/sec: 3229.98 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 19:53:49,033 epoch 7 - iter 189/272 - loss 0.01493295 - time (sec): 10.93 - samples/sec: 3265.83 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 19:53:50,573 epoch 7 - iter 216/272 - loss 0.01531514 - time (sec): 12.47 - samples/sec: 3293.68 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 19:53:52,264 epoch 7 - iter 243/272 - loss 0.01629045 - time (sec): 14.16 - samples/sec: 3275.62 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 19:53:53,909 epoch 7 - iter 270/272 - loss 0.01759728 - time (sec): 15.81 - samples/sec: 3268.76 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 19:53:54,017 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:54,017 EPOCH 7 done: loss 0.0175 - lr: 0.000010 |
|
2023-10-17 19:53:55,472 DEV : loss 0.1710209846496582 - f1-score (micro avg) 0.8118 |
|
2023-10-17 19:53:55,477 saving best model |
|
2023-10-17 19:53:55,954 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:53:57,507 epoch 8 - iter 27/272 - loss 0.01604160 - time (sec): 1.55 - samples/sec: 3207.60 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 19:53:59,234 epoch 8 - iter 54/272 - loss 0.01064724 - time (sec): 3.28 - samples/sec: 3353.42 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 19:54:00,786 epoch 8 - iter 81/272 - loss 0.00922370 - time (sec): 4.83 - samples/sec: 3384.65 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 19:54:02,279 epoch 8 - iter 108/272 - loss 0.01069396 - time (sec): 6.32 - samples/sec: 3298.73 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 19:54:03,890 epoch 8 - iter 135/272 - loss 0.01198809 - time (sec): 7.93 - samples/sec: 3323.78 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 19:54:05,494 epoch 8 - iter 162/272 - loss 0.01143605 - time (sec): 9.54 - samples/sec: 3313.47 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 19:54:07,237 epoch 8 - iter 189/272 - loss 0.01236243 - time (sec): 11.28 - samples/sec: 3354.86 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 19:54:08,640 epoch 8 - iter 216/272 - loss 0.01336567 - time (sec): 12.68 - samples/sec: 3310.13 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 19:54:10,128 epoch 8 - iter 243/272 - loss 0.01295151 - time (sec): 14.17 - samples/sec: 3263.95 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 19:54:11,769 epoch 8 - iter 270/272 - loss 0.01256788 - time (sec): 15.81 - samples/sec: 3273.79 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 19:54:11,859 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:54:11,859 EPOCH 8 done: loss 0.0126 - lr: 0.000007 |
|
2023-10-17 19:54:13,299 DEV : loss 0.17594939470291138 - f1-score (micro avg) 0.8118 |
|
2023-10-17 19:54:13,305 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:54:14,823 epoch 9 - iter 27/272 - loss 0.00194437 - time (sec): 1.52 - samples/sec: 3162.49 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 19:54:16,484 epoch 9 - iter 54/272 - loss 0.00302095 - time (sec): 3.18 - samples/sec: 3141.62 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 19:54:17,957 epoch 9 - iter 81/272 - loss 0.00280715 - time (sec): 4.65 - samples/sec: 3030.41 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 19:54:19,699 epoch 9 - iter 108/272 - loss 0.00750767 - time (sec): 6.39 - samples/sec: 3148.64 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 19:54:21,168 epoch 9 - iter 135/272 - loss 0.01060348 - time (sec): 7.86 - samples/sec: 3153.49 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 19:54:22,717 epoch 9 - iter 162/272 - loss 0.01005783 - time (sec): 9.41 - samples/sec: 3157.07 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 19:54:24,425 epoch 9 - iter 189/272 - loss 0.00964176 - time (sec): 11.12 - samples/sec: 3278.39 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 19:54:26,194 epoch 9 - iter 216/272 - loss 0.01037476 - time (sec): 12.89 - samples/sec: 3239.70 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 19:54:27,722 epoch 9 - iter 243/272 - loss 0.00974212 - time (sec): 14.42 - samples/sec: 3207.43 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 19:54:29,301 epoch 9 - iter 270/272 - loss 0.00880051 - time (sec): 15.99 - samples/sec: 3238.36 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 19:54:29,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:54:29,386 EPOCH 9 done: loss 0.0089 - lr: 0.000003 |
|
2023-10-17 19:54:30,831 DEV : loss 0.18299776315689087 - f1-score (micro avg) 0.8059 |
|
2023-10-17 19:54:30,836 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:54:32,339 epoch 10 - iter 27/272 - loss 0.00732078 - time (sec): 1.50 - samples/sec: 3372.08 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 19:54:33,860 epoch 10 - iter 54/272 - loss 0.00549127 - time (sec): 3.02 - samples/sec: 3236.59 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 19:54:35,365 epoch 10 - iter 81/272 - loss 0.00525623 - time (sec): 4.53 - samples/sec: 3208.88 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 19:54:36,901 epoch 10 - iter 108/272 - loss 0.00566812 - time (sec): 6.06 - samples/sec: 3298.58 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 19:54:38,576 epoch 10 - iter 135/272 - loss 0.00528647 - time (sec): 7.74 - samples/sec: 3324.83 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 19:54:40,325 epoch 10 - iter 162/272 - loss 0.00586342 - time (sec): 9.49 - samples/sec: 3329.11 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 19:54:41,862 epoch 10 - iter 189/272 - loss 0.00551525 - time (sec): 11.03 - samples/sec: 3281.65 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 19:54:43,484 epoch 10 - iter 216/272 - loss 0.00658959 - time (sec): 12.65 - samples/sec: 3263.82 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 19:54:45,242 epoch 10 - iter 243/272 - loss 0.00839720 - time (sec): 14.40 - samples/sec: 3257.94 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 19:54:46,813 epoch 10 - iter 270/272 - loss 0.00763962 - time (sec): 15.98 - samples/sec: 3242.16 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 19:54:46,910 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 19:54:46,911 EPOCH 10 done: loss 0.0076 - lr: 0.000000 |
|
2023-10-17 19:54:48,333 DEV : loss 0.1869087666273117 - f1-score (micro avg) 0.8067 |
|
2023-10-17 19:54:48,709 ---------------------------------------------------------------------------------------------------- |
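The checkpoint reloaded in the next step can be identified from the `DEV` lines alone. The sketch below assumes that a checkpoint is written only when the dev micro F1 strictly improves, which is consistent with the `saving best model` messages above (epoch 8 ties epoch 7 at 0.8118 and triggers no save). The helper `best_epoch` is a hypothetical name used for illustration.

```python
import re

# DEV lines copied from the log above (timestamps dropped for brevity).
DEV_LINES = """
DEV : loss 0.17717474699020386 - f1-score (micro avg) 0.5914
DEV : loss 0.1165740042924881 - f1-score (micro avg) 0.7687
DEV : loss 0.11597966402769089 - f1-score (micro avg) 0.7549
DEV : loss 0.1187073215842247 - f1-score (micro avg) 0.7993
DEV : loss 0.1374482363462448 - f1-score (micro avg) 0.8007
DEV : loss 0.15510904788970947 - f1-score (micro avg) 0.8015
DEV : loss 0.1710209846496582 - f1-score (micro avg) 0.8118
DEV : loss 0.17594939470291138 - f1-score (micro avg) 0.8118
DEV : loss 0.18299776315689087 - f1-score (micro avg) 0.8059
DEV : loss 0.1869087666273117 - f1-score (micro avg) 0.8067
"""

DEV_PATTERN = re.compile(r"DEV : loss (\S+) - f1-score \(micro avg\)\s+(\S+)")


def best_epoch(log_text):
    """Return (best epoch, best dev F1, epochs at which a save would occur)."""
    best, best_f1, saved = None, float("-inf"), []
    # One DEV line is logged per epoch, so the match order gives the epoch number.
    for epoch, match in enumerate(DEV_PATTERN.finditer(log_text), start=1):
        f1 = float(match.group(2))
        if f1 > best_f1:  # strict improvement only (assumption, see lead-in)
            best, best_f1 = epoch, f1
            saved.append(epoch)
    return best, best_f1, saved


if __name__ == "__main__":
    print(best_epoch(DEV_LINES))
```

Note that the dev loss rises from epoch 3 onwards while the dev F1 keeps improving until epoch 7, so selecting on loss instead of F1 would pick a different checkpoint.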
|
2023-10-17 19:54:48,710 Loading model from best epoch ... |
|
2023-10-17 19:54:50,049 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
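The 17 tags listed above follow a BIOES-style scheme: one `O` tag plus four positional prefixes for each of the four entity types. This also accounts for `out_features=17` in the model's final `Linear` layer. A minimal sketch that rebuilds the same list in the same order (the variable names are illustrative):

```python
# Entity types in the order they appear in the logged tag dictionary.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]

# Positional prefixes: Single, Begin, End, Inside.
prefixes = ["S", "B", "E", "I"]

# "O" (outside any entity) followed by every prefix/type combination.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

if __name__ == "__main__":
    print(len(tags))
    print(", ".join(tags))
```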
|
2023-10-17 19:54:52,035 |
|
Results: |
|
- F-score (micro) 0.781 |
|
- F-score (macro) 0.7137 |
|
- Accuracy 0.6595 |
|
|
|
By class: |
|
                precision    recall  f1-score   support |
|
|

         LOC       0.7988    0.8526    0.8248       312 |
|
         PER       0.7143    0.8654    0.7826       208 |
|
         ORG       0.5778    0.4727    0.5200        55 |
|
   HumanProd       0.6061    0.9091    0.7273        22 |
|
|

   micro avg       0.7421    0.8241    0.7810       597 |
|
   macro avg       0.6742    0.7749    0.7137       597 |
|
weighted avg       0.7419    0.8241    0.7784       597 |
|
|
|
2023-10-17 19:54:52,036 ---------------------------------------------------------------------------------------------------- |
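The summary rows of the per-class report can be cross-checked against the class rows. The sketch below recomputes the macro and support-weighted F1 from the per-class F1 values, and the micro F1 as the harmonic mean of the reported micro precision and recall. Because the inputs are the four-decimal values printed in the log, results are only expected to agree up to rounding.

```python
# Per-class rows from the report: (precision, recall, f1, support).
report = {
    "LOC": (0.7988, 0.8526, 0.8248, 312),
    "PER": (0.7143, 0.8654, 0.7826, 208),
    "ORG": (0.5778, 0.4727, 0.5200, 55),
    "HumanProd": (0.6061, 0.9091, 0.7273, 22),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
MICRO_PRECISION = 0.7421
MICRO_RECALL = 0.8241

total_support = sum(row[3] for row in report.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted average: per-class F1 weighted by the number of gold entities.
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

# Micro F1: harmonic mean of micro precision and micro recall.
micro_f1 = 2 * MICRO_PRECISION * MICRO_RECALL / (MICRO_PRECISION + MICRO_RECALL)

if __name__ == "__main__":
    print(f"support      {total_support}")
    print(f"micro F1     {micro_f1:.4f}")
    print(f"macro F1     {macro_f1:.4f}")
    print(f"weighted F1  {weighted_f1:.4f}")
```

The gap between micro and macro F1 comes mainly from the `ORG` class, which has the lowest F1 (0.5200) but only 55 of the 597 gold entities.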
|
|