|
2023-10-25 20:48:39,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,157 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(64001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 20:48:39,157 ---------------------------------------------------------------------------------------------------- |
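The module printout above gives every layer shape, so the number of trainable parameters can be estimated by hand. The following is a minimal sketch, not part of the logged run: the helper names (`linear_params`, `layer_norm_params`, `bert_layer_params`, `tagger_params`) are hypothetical, and the count assumes that only `Embedding`, `Linear`, and `LayerNorm` modules hold weights (dropout and activation modules hold none).

```python
def linear_params(n_in, n_out, bias=True):
    """Weights plus optional bias of a Linear(in_features, out_features) layer."""
    return n_in * n_out + (n_out if bias else 0)


def layer_norm_params(dim):
    """Scale and shift vectors of a LayerNorm with elementwise_affine=True."""
    return 2 * dim


def bert_layer_params(hidden=768, intermediate=3072):
    """One BertLayer: query/key/value/output projections plus the feed-forward block."""
    attention = 4 * linear_params(hidden, hidden) + layer_norm_params(hidden)
    feed_forward = (
        linear_params(hidden, intermediate)
        + linear_params(intermediate, hidden)
        + layer_norm_params(hidden)
    )
    return attention + feed_forward


def tagger_params(vocab=64001, positions=512, token_types=2, hidden=768,
                  intermediate=3072, layers=12, num_tags=17):
    """Parameter count per component, using the shapes printed in the log."""
    counts = {
        "embeddings": (vocab + positions + token_types) * hidden + layer_norm_params(hidden),
        "encoder": layers * bert_layer_params(hidden, intermediate),
        "pooler": linear_params(hidden, hidden),
        "tag_head": linear_params(hidden, num_tags),
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    for name, count in tagger_params().items():
        print(f"{name:>10}: {count:,}")
```

Under these assumptions the word-embedding table (64001 x 768) accounts for roughly a third of the total, while the 768-to-17 tagging head is negligible by comparison.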
|
2023-10-25 20:48:39,158 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-25 20:48:39,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,158 Train: 1085 sentences |
|
2023-10-25 20:48:39,158 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 20:48:39,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,158 Training Params: |
|
2023-10-25 20:48:39,158 - learning_rate: "5e-05" |
|
2023-10-25 20:48:39,158 - mini_batch_size: "8" |
|
2023-10-25 20:48:39,158 - max_epochs: "10" |
|
2023-10-25 20:48:39,158 - shuffle: "True" |
|
2023-10-25 20:48:39,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,158 Plugins: |
|
2023-10-25 20:48:39,158 - TensorboardLogger |
|
2023-10-25 20:48:39,158 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 20:48:39,158 ---------------------------------------------------------------------------------------------------- |
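The parameters above imply 136 iterations per epoch (1085 sentences in mini-batches of 8, rounded up), which matches the `iter N/136` counters further down, and 1360 optimizer steps over 10 epochs. The sketch below shows one common form of a linear schedule with warmup. It is an illustration under assumptions, not Flair's own code: the function name `linear_warmup_decay_lr` is hypothetical, and the exact step at which the trainer samples the learning rate for logging may differ by one.

```python
import math


def linear_warmup_decay_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate after `step` optimizer updates.

    Rises linearly from 0 to `peak_lr` during warmup, then falls linearly
    back to 0 at `total_steps`.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 1085 train sentences, mini_batch_size 8, max_epochs 10.
steps_per_epoch = math.ceil(1085 / 8)
total_steps = steps_per_epoch * 10

if __name__ == "__main__":
    for step in (0, 68, 136, 748, 1360):
        lr = linear_warmup_decay_lr(step, total_steps, 5e-05, 0.1)
        print(f"step {step:4d}: lr {lr:.6f}")
```

With a warmup fraction of 0.1 the warmup covers exactly the first epoch, which is consistent with the logged learning rate climbing throughout epoch 1 and decreasing from epoch 2 onward.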
|
2023-10-25 20:48:39,158 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 20:48:39,158 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 20:48:39,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,158 Computation: |
|
2023-10-25 20:48:39,158 - compute on device: cuda:0 |
|
2023-10-25 20:48:39,158 - embedding storage: none |
|
2023-10-25 20:48:39,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,158 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-25 20:48:39,159 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,159 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:39,159 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 20:48:40,096 epoch 1 - iter 13/136 - loss 3.44621732 - time (sec): 0.94 - samples/sec: 5293.81 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:48:41,128 epoch 1 - iter 26/136 - loss 2.81552028 - time (sec): 1.97 - samples/sec: 5012.97 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:48:42,097 epoch 1 - iter 39/136 - loss 2.12788098 - time (sec): 2.94 - samples/sec: 4999.35 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:48:43,125 epoch 1 - iter 52/136 - loss 1.71483581 - time (sec): 3.97 - samples/sec: 4909.24 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:48:44,123 epoch 1 - iter 65/136 - loss 1.44019105 - time (sec): 4.96 - samples/sec: 4993.24 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:48:45,078 epoch 1 - iter 78/136 - loss 1.26816364 - time (sec): 5.92 - samples/sec: 4938.58 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:48:45,954 epoch 1 - iter 91/136 - loss 1.13162328 - time (sec): 6.79 - samples/sec: 5023.95 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 20:48:46,947 epoch 1 - iter 104/136 - loss 1.02701632 - time (sec): 7.79 - samples/sec: 4958.08 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 20:48:47,969 epoch 1 - iter 117/136 - loss 0.92414239 - time (sec): 8.81 - samples/sec: 4962.68 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 20:48:48,955 epoch 1 - iter 130/136 - loss 0.83824141 - time (sec): 9.80 - samples/sec: 5021.04 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 20:48:49,501 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:49,501 EPOCH 1 done: loss 0.8058 - lr: 0.000047 |
|
2023-10-25 20:48:50,630 DEV : loss 0.14142106473445892 - f1-score (micro avg) 0.6475 |
|
2023-10-25 20:48:50,637 saving best model |
|
2023-10-25 20:48:51,181 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:48:52,235 epoch 2 - iter 13/136 - loss 0.12734756 - time (sec): 1.05 - samples/sec: 5455.90 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-25 20:48:53,252 epoch 2 - iter 26/136 - loss 0.13122678 - time (sec): 2.07 - samples/sec: 5314.49 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-25 20:48:54,332 epoch 2 - iter 39/136 - loss 0.13430043 - time (sec): 3.15 - samples/sec: 5138.71 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 20:48:55,312 epoch 2 - iter 52/136 - loss 0.14328871 - time (sec): 4.13 - samples/sec: 5109.27 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 20:48:56,294 epoch 2 - iter 65/136 - loss 0.13905783 - time (sec): 5.11 - samples/sec: 4933.71 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 20:48:57,358 epoch 2 - iter 78/136 - loss 0.13823362 - time (sec): 6.18 - samples/sec: 4976.19 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 20:48:58,300 epoch 2 - iter 91/136 - loss 0.13445724 - time (sec): 7.12 - samples/sec: 4966.95 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 20:48:59,161 epoch 2 - iter 104/136 - loss 0.13451633 - time (sec): 7.98 - samples/sec: 4936.33 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 20:49:00,124 epoch 2 - iter 117/136 - loss 0.13178598 - time (sec): 8.94 - samples/sec: 4987.74 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 20:49:01,135 epoch 2 - iter 130/136 - loss 0.12777214 - time (sec): 9.95 - samples/sec: 5003.89 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 20:49:01,563 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:01,564 EPOCH 2 done: loss 0.1263 - lr: 0.000045 |
|
2023-10-25 20:49:02,789 DEV : loss 0.11350668966770172 - f1-score (micro avg) 0.756 |
|
2023-10-25 20:49:02,795 saving best model |
|
2023-10-25 20:49:03,533 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:04,427 epoch 3 - iter 13/136 - loss 0.05059006 - time (sec): 0.89 - samples/sec: 4488.54 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-25 20:49:05,510 epoch 3 - iter 26/136 - loss 0.04888939 - time (sec): 1.98 - samples/sec: 5532.46 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 20:49:06,469 epoch 3 - iter 39/136 - loss 0.05876360 - time (sec): 2.93 - samples/sec: 5337.65 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 20:49:07,464 epoch 3 - iter 52/136 - loss 0.06067732 - time (sec): 3.93 - samples/sec: 5177.87 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 20:49:08,484 epoch 3 - iter 65/136 - loss 0.06412285 - time (sec): 4.95 - samples/sec: 5090.45 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 20:49:09,410 epoch 3 - iter 78/136 - loss 0.05836717 - time (sec): 5.88 - samples/sec: 5085.86 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 20:49:10,549 epoch 3 - iter 91/136 - loss 0.06151969 - time (sec): 7.01 - samples/sec: 5023.67 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 20:49:11,481 epoch 3 - iter 104/136 - loss 0.05937838 - time (sec): 7.95 - samples/sec: 5016.12 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 20:49:12,455 epoch 3 - iter 117/136 - loss 0.06045340 - time (sec): 8.92 - samples/sec: 4992.03 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 20:49:13,368 epoch 3 - iter 130/136 - loss 0.06224560 - time (sec): 9.83 - samples/sec: 5024.44 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-25 20:49:13,847 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:13,847 EPOCH 3 done: loss 0.0619 - lr: 0.000039 |
|
2023-10-25 20:49:15,053 DEV : loss 0.10497446358203888 - f1-score (micro avg) 0.7948 |
|
2023-10-25 20:49:15,059 saving best model |
|
2023-10-25 20:49:15,757 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:17,120 epoch 4 - iter 13/136 - loss 0.03225486 - time (sec): 1.36 - samples/sec: 4062.43 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 20:49:18,056 epoch 4 - iter 26/136 - loss 0.03426158 - time (sec): 2.30 - samples/sec: 4433.75 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 20:49:18,988 epoch 4 - iter 39/136 - loss 0.03368952 - time (sec): 3.23 - samples/sec: 4508.92 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 20:49:19,940 epoch 4 - iter 52/136 - loss 0.03541243 - time (sec): 4.18 - samples/sec: 4616.30 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 20:49:20,936 epoch 4 - iter 65/136 - loss 0.03199092 - time (sec): 5.18 - samples/sec: 4709.24 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 20:49:21,911 epoch 4 - iter 78/136 - loss 0.03412572 - time (sec): 6.15 - samples/sec: 4744.26 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 20:49:22,889 epoch 4 - iter 91/136 - loss 0.03643664 - time (sec): 7.13 - samples/sec: 4725.75 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 20:49:23,998 epoch 4 - iter 104/136 - loss 0.03548437 - time (sec): 8.24 - samples/sec: 4765.19 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 20:49:25,039 epoch 4 - iter 117/136 - loss 0.03535321 - time (sec): 9.28 - samples/sec: 4796.44 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 20:49:26,098 epoch 4 - iter 130/136 - loss 0.03515139 - time (sec): 10.34 - samples/sec: 4817.63 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 20:49:26,516 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:26,517 EPOCH 4 done: loss 0.0347 - lr: 0.000034 |
|
2023-10-25 20:49:27,683 DEV : loss 0.12341772019863129 - f1-score (micro avg) 0.7792 |
|
2023-10-25 20:49:27,690 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:28,564 epoch 5 - iter 13/136 - loss 0.03229045 - time (sec): 0.87 - samples/sec: 4762.42 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 20:49:29,465 epoch 5 - iter 26/136 - loss 0.02493843 - time (sec): 1.77 - samples/sec: 4963.40 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 20:49:30,459 epoch 5 - iter 39/136 - loss 0.02616394 - time (sec): 2.77 - samples/sec: 5211.24 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 20:49:31,481 epoch 5 - iter 52/136 - loss 0.02384447 - time (sec): 3.79 - samples/sec: 5137.87 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 20:49:32,418 epoch 5 - iter 65/136 - loss 0.02464925 - time (sec): 4.73 - samples/sec: 5146.67 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 20:49:33,481 epoch 5 - iter 78/136 - loss 0.02311740 - time (sec): 5.79 - samples/sec: 5121.45 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 20:49:34,479 epoch 5 - iter 91/136 - loss 0.02126136 - time (sec): 6.79 - samples/sec: 5043.65 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 20:49:35,514 epoch 5 - iter 104/136 - loss 0.01977040 - time (sec): 7.82 - samples/sec: 5014.30 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:49:36,506 epoch 5 - iter 117/136 - loss 0.01998176 - time (sec): 8.81 - samples/sec: 5070.73 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:49:37,444 epoch 5 - iter 130/136 - loss 0.01980121 - time (sec): 9.75 - samples/sec: 5099.45 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:49:37,938 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:37,938 EPOCH 5 done: loss 0.0213 - lr: 0.000028 |
|
2023-10-25 20:49:39,084 DEV : loss 0.16028477251529694 - f1-score (micro avg) 0.8065 |
|
2023-10-25 20:49:39,090 saving best model |
|
2023-10-25 20:49:40,135 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:41,088 epoch 6 - iter 13/136 - loss 0.01282957 - time (sec): 0.95 - samples/sec: 4745.81 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:49:42,058 epoch 6 - iter 26/136 - loss 0.01470040 - time (sec): 1.92 - samples/sec: 4861.14 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:49:42,994 epoch 6 - iter 39/136 - loss 0.01443160 - time (sec): 2.86 - samples/sec: 4943.95 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:49:43,983 epoch 6 - iter 52/136 - loss 0.01360045 - time (sec): 3.84 - samples/sec: 4981.36 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:49:44,925 epoch 6 - iter 65/136 - loss 0.01912094 - time (sec): 4.79 - samples/sec: 4974.54 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:49:46,002 epoch 6 - iter 78/136 - loss 0.01685248 - time (sec): 5.86 - samples/sec: 5113.66 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:49:47,019 epoch 6 - iter 91/136 - loss 0.01649329 - time (sec): 6.88 - samples/sec: 5088.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:49:47,919 epoch 6 - iter 104/136 - loss 0.01630070 - time (sec): 7.78 - samples/sec: 5077.11 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:49:48,888 epoch 6 - iter 117/136 - loss 0.01602221 - time (sec): 8.75 - samples/sec: 5158.73 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:49:49,720 epoch 6 - iter 130/136 - loss 0.01676743 - time (sec): 9.58 - samples/sec: 5216.10 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:49:50,144 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:50,144 EPOCH 6 done: loss 0.0167 - lr: 0.000023 |
|
2023-10-25 20:49:51,309 DEV : loss 0.1623808890581131 - f1-score (micro avg) 0.8036 |
|
2023-10-25 20:49:51,316 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:49:52,304 epoch 7 - iter 13/136 - loss 0.00863118 - time (sec): 0.99 - samples/sec: 5279.34 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 20:49:53,301 epoch 7 - iter 26/136 - loss 0.01283399 - time (sec): 1.98 - samples/sec: 5145.74 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:49:54,405 epoch 7 - iter 39/136 - loss 0.01289502 - time (sec): 3.09 - samples/sec: 4805.92 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:49:55,384 epoch 7 - iter 52/136 - loss 0.01267667 - time (sec): 4.07 - samples/sec: 4795.60 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:49:56,276 epoch 7 - iter 65/136 - loss 0.01198252 - time (sec): 4.96 - samples/sec: 4864.72 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:49:57,294 epoch 7 - iter 78/136 - loss 0.01219762 - time (sec): 5.98 - samples/sec: 4827.15 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:49:58,244 epoch 7 - iter 91/136 - loss 0.01107984 - time (sec): 6.93 - samples/sec: 4925.60 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:49:59,339 epoch 7 - iter 104/136 - loss 0.01149606 - time (sec): 8.02 - samples/sec: 4903.34 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:50:00,334 epoch 7 - iter 117/136 - loss 0.01108181 - time (sec): 9.02 - samples/sec: 4990.29 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:50:01,326 epoch 7 - iter 130/136 - loss 0.01021550 - time (sec): 10.01 - samples/sec: 4996.48 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 20:50:01,730 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:01,731 EPOCH 7 done: loss 0.0106 - lr: 0.000017 |
|
2023-10-25 20:50:02,910 DEV : loss 0.1861436367034912 - f1-score (micro avg) 0.8124 |
|
2023-10-25 20:50:02,917 saving best model |
|
2023-10-25 20:50:03,630 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:04,747 epoch 8 - iter 13/136 - loss 0.01917835 - time (sec): 1.11 - samples/sec: 4804.02 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:50:05,933 epoch 8 - iter 26/136 - loss 0.01454768 - time (sec): 2.30 - samples/sec: 4417.12 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:50:06,932 epoch 8 - iter 39/136 - loss 0.01120506 - time (sec): 3.30 - samples/sec: 4570.76 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:50:07,847 epoch 8 - iter 52/136 - loss 0.01089056 - time (sec): 4.21 - samples/sec: 4758.80 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:50:08,814 epoch 8 - iter 65/136 - loss 0.00989532 - time (sec): 5.18 - samples/sec: 4751.70 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:50:09,926 epoch 8 - iter 78/136 - loss 0.00913829 - time (sec): 6.29 - samples/sec: 4740.13 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:50:10,924 epoch 8 - iter 91/136 - loss 0.00925564 - time (sec): 7.29 - samples/sec: 4737.58 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 20:50:11,858 epoch 8 - iter 104/136 - loss 0.00947735 - time (sec): 8.22 - samples/sec: 4687.47 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 20:50:12,859 epoch 8 - iter 117/136 - loss 0.00912730 - time (sec): 9.23 - samples/sec: 4728.58 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:50:13,865 epoch 8 - iter 130/136 - loss 0.00862663 - time (sec): 10.23 - samples/sec: 4818.59 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:50:14,352 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:14,352 EPOCH 8 done: loss 0.0091 - lr: 0.000012 |
|
2023-10-25 20:50:15,604 DEV : loss 0.1832166463136673 - f1-score (micro avg) 0.8015 |
|
2023-10-25 20:50:15,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:16,589 epoch 9 - iter 13/136 - loss 0.00482522 - time (sec): 0.98 - samples/sec: 4981.60 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 20:50:17,538 epoch 9 - iter 26/136 - loss 0.00440215 - time (sec): 1.93 - samples/sec: 4894.16 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 20:50:18,569 epoch 9 - iter 39/136 - loss 0.00514377 - time (sec): 2.96 - samples/sec: 4809.01 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 20:50:19,529 epoch 9 - iter 52/136 - loss 0.00550980 - time (sec): 3.92 - samples/sec: 4939.16 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:50:20,578 epoch 9 - iter 65/136 - loss 0.00449967 - time (sec): 4.97 - samples/sec: 5142.06 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:50:21,531 epoch 9 - iter 78/136 - loss 0.00555236 - time (sec): 5.92 - samples/sec: 5145.84 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:50:22,443 epoch 9 - iter 91/136 - loss 0.00655557 - time (sec): 6.83 - samples/sec: 5097.48 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:50:23,408 epoch 9 - iter 104/136 - loss 0.00617669 - time (sec): 7.80 - samples/sec: 5121.25 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 20:50:24,387 epoch 9 - iter 117/136 - loss 0.00735580 - time (sec): 8.77 - samples/sec: 5191.84 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 20:50:25,321 epoch 9 - iter 130/136 - loss 0.00734840 - time (sec): 9.71 - samples/sec: 5174.57 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:50:25,672 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:25,673 EPOCH 9 done: loss 0.0074 - lr: 0.000006 |
|
2023-10-25 20:50:26,877 DEV : loss 0.19039608538150787 - f1-score (micro avg) 0.8227 |
|
2023-10-25 20:50:26,883 saving best model |
|
2023-10-25 20:50:27,706 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:28,600 epoch 10 - iter 13/136 - loss 0.00180411 - time (sec): 0.89 - samples/sec: 4851.61 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:50:29,885 epoch 10 - iter 26/136 - loss 0.00156497 - time (sec): 2.18 - samples/sec: 3965.23 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:50:30,853 epoch 10 - iter 39/136 - loss 0.00319720 - time (sec): 3.14 - samples/sec: 4417.44 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:50:31,936 epoch 10 - iter 52/136 - loss 0.00347400 - time (sec): 4.23 - samples/sec: 4648.61 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:50:32,863 epoch 10 - iter 65/136 - loss 0.00362540 - time (sec): 5.15 - samples/sec: 4575.68 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:50:33,819 epoch 10 - iter 78/136 - loss 0.00555418 - time (sec): 6.11 - samples/sec: 4590.04 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:50:34,858 epoch 10 - iter 91/136 - loss 0.00584215 - time (sec): 7.15 - samples/sec: 4649.20 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:50:35,906 epoch 10 - iter 104/136 - loss 0.00683194 - time (sec): 8.20 - samples/sec: 4753.70 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:50:36,840 epoch 10 - iter 117/136 - loss 0.00616078 - time (sec): 9.13 - samples/sec: 4801.67 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 20:50:37,883 epoch 10 - iter 130/136 - loss 0.00593112 - time (sec): 10.17 - samples/sec: 4830.32 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 20:50:38,383 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:38,383 EPOCH 10 done: loss 0.0056 - lr: 0.000000 |
|
2023-10-25 20:50:39,603 DEV : loss 0.19227778911590576 - f1-score (micro avg) 0.8162 |
|
2023-10-25 20:50:40,134 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:50:40,136 Loading model from best epoch ... |
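The best epoch can be recovered from the `DEV :` lines above: the dev micro F1 peaks at 0.8227 after epoch 9, the last epoch followed by "saving best model". A minimal sketch of that lookup follows, assuming the line format seen in this log and one `DEV :` line per epoch in order. The helper names `dev_scores` and `best_epoch` are hypothetical and not part of Flair.

```python
import re

# Matches lines such as:
# "2023-10-25 20:50:26,877 DEV : loss 0.19039608538150787 - f1-score (micro avg) 0.8227"
DEV_LINE = re.compile(r"DEV : loss ([0-9.]+) - f1-score \(micro avg\)\s+([0-9.]+)")


def dev_scores(lines):
    """Return a list of (epoch, dev_loss, dev_f1), numbering epochs from 1."""
    scores = []
    for line in lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append((len(scores) + 1, float(match.group(1)), float(match.group(2))))
    return scores


def best_epoch(lines):
    """Epoch with the highest dev F1; the earliest epoch wins on a tie."""
    scores = dev_scores(lines)
    return max(scores, key=lambda item: (item[2], -item[0]))


if __name__ == "__main__":
    sample = [
        "2023-10-25 20:50:15,604 DEV : loss 0.1832166463136673 - f1-score (micro avg) 0.8015",
        "2023-10-25 20:50:26,877 DEV : loss 0.19039608538150787 - f1-score (micro avg) 0.8227",
        "2023-10-25 20:50:39,603 DEV : loss 0.19227778911590576 - f1-score (micro avg) 0.8162",
    ]
    print(best_epoch(sample))
```

Note that the selection follows dev F1, not dev loss: the dev loss is lowest after epoch 3 (0.1050) and rises afterwards while F1 keeps improving.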
|
2023-10-25 20:50:42,183 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
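The 17 tags above are the BIOES encoding of four entity types (LOC, PER, HumanProd, ORG): four prefixes per type plus the single `O` tag. The sketch below shows how such a tag sequence can be turned into entity spans. It is an illustrative strict decoder under assumptions, not Flair's own decoding logic: the function name `bioes_to_spans` is hypothetical, and malformed sequences (for example a `B-` without a matching `E-`) are simply dropped.

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, label) spans, end exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, _, kind = tag.partition("-")
        if prefix == "S":
            # Single-token entity.
            spans.append((i, i + 1, kind))
            start, label = None, None
        elif prefix == "B":
            # Open a new multi-token entity.
            start, label = i, kind
        elif prefix == "I":
            # Continue only if it matches the currently open entity.
            if label != kind:
                start, label = None, None
        elif prefix == "E":
            # Close the open entity if the types agree.
            if start is not None and label == kind:
                spans.append((start, i + 1, kind))
            start, label = None, None
    return spans


if __name__ == "__main__":
    tags = ["O", "B-PER", "E-PER", "S-LOC", "B-ORG", "I-ORG", "E-ORG"]
    print(bioes_to_spans(tags))
```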
|
2023-10-25 20:50:44,441 |
|
Results: |
|
- F-score (micro) 0.7744 |
|
- F-score (macro) 0.7288 |
|
- Accuracy 0.6521 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.8255    0.8494    0.8373       312 |
|
         PER     0.6654    0.8702    0.7542       208 |
|
         ORG     0.4615    0.4364    0.4486        55 |
|
   HumanProd     0.8077    0.9545    0.8750        22 |
|
|
|
   micro avg     0.7317    0.8224    0.7744       597 |
|
   macro avg     0.6901    0.7776    0.7288       597 |
|
weighted avg     0.7356    0.8224    0.7739       597 |
|
|
|
2023-10-25 20:50:44,441 ---------------------------------------------------------------------------------------------------- |
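The averaged rows of the report above can be re-derived from the per-class rows. The sketch below is a consistency check on the printed (rounded) values, not the evaluation code used by the run; the names `REPORT`, `f1`, `macro_f1`, and `weighted_f1` are hypothetical. Because the inputs are rounded to four decimals, the results agree with the log only to about that precision.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
REPORT = {
    "LOC": (0.8255, 0.8494, 0.8373, 312),
    "PER": (0.6654, 0.8702, 0.7542, 208),
    "ORG": (0.4615, 0.4364, 0.4486, 55),
    "HumanProd": (0.8077, 0.9545, 0.8750, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(row[2] for row in report.values()) / len(report)


def weighted_f1(report):
    """Per-class F1 scores averaged with class support as weights."""
    support = sum(row[3] for row in report.values())
    return sum(row[2] * row[3] for row in report.values()) / support


if __name__ == "__main__":
    print(f"micro F1 from micro P/R: {f1(0.7317, 0.8224):.4f}")
    print(f"macro F1:                {macro_f1(REPORT):.4f}")
    print(f"weighted F1:             {weighted_f1(REPORT):.4f}")
```

The gap between micro F1 (0.7744) and macro F1 (0.7288) comes mainly from ORG, which has low scores but only 55 of the 597 gold entities, so it pulls the unweighted macro average down more than the micro average.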
|
|