2023-10-25 20:59:44,781 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,782 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
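The shapes in the repr above are enough to cross-check the model's size by hand. A minimal sketch (plain Python; it counts only the modules actually printed, so the total is an estimate that excludes anything the repr omits):

```python
# Rough parameter count for the BertModel shown above (weights + biases),
# derived only from the shapes in the printed repr.
hidden, ffn, layers, vocab, max_pos = 768, 3072, 12, 64001, 512

def linear(n_in, n_out):
    # weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(n):
    # scale + shift vectors
    return 2 * n

# word + position + token-type embeddings, plus their LayerNorm
embeddings = (vocab + max_pos + 2) * hidden + layer_norm(hidden)

per_layer = (
    4 * linear(hidden, hidden)   # query, key, value, attention output dense
    + 2 * layer_norm(hidden)     # post-attention and post-FFN LayerNorm
    + linear(hidden, ffn)        # intermediate dense
    + linear(ffn, hidden)        # output dense
)

pooler = linear(hidden, hidden)
total = embeddings + layers * per_layer + pooler

print(per_layer)  # 7_087_872 per BertLayer
print(total)      # ~135M for the encoder shown (64k vocab dominates)
```

The 64k vocabulary makes the embedding table alone roughly 49M parameters, about a third of the encoder.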
2023-10-25 20:59:44,782 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Train: 1085 sentences
2023-10-25 20:59:44,783 (train_with_dev=False, train_with_test=False)
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Training Params:
2023-10-25 20:59:44,783 - learning_rate: "5e-05"
2023-10-25 20:59:44,783 - mini_batch_size: "8"
2023-10-25 20:59:44,783 - max_epochs: "10"
2023-10-25 20:59:44,783 - shuffle: "True"
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Plugins:
2023-10-25 20:59:44,783 - TensorboardLogger
2023-10-25 20:59:44,783 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
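The LinearScheduler with warmup_fraction '0.1' explains the lr column in the iteration lines below: 1085 sentences at batch size 8 give 136 batches per epoch, so 1360 steps over 10 epochs, with the first ~136 steps (all of epoch 1) spent warming up to 5e-05 before a linear decay to zero. A minimal sketch of that shape (plain Python; the exact step indexing is an assumption inferred from the logged lr values, not Flair's implementation, and Flair's printed values appear truncated rather than rounded):

```python
import math

base_lr = 5e-05
sentences, batch_size, epochs = 1085, 8, 10
steps_per_epoch = math.ceil(sentences / batch_size)  # 136, matching "iter .../136"
total_steps = steps_per_epoch * epochs               # 1360
warmup_steps = int(0.1 * total_steps)                # 136 -> warmup covers epoch 1

def lr_at(step):
    """Linear warmup to base_lr, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Tracks the logged column: tiny lr early in epoch 1, near 5e-05 at its end,
# then a slow decay that reaches 0 at the final step.
print(lr_at(13), lr_at(130), lr_at(700), lr_at(1360))
```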
2023-10-25 20:59:44,783 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:59:44,783 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Computation:
2023-10-25 20:59:44,784 - compute on device: cuda:0
2023-10-25 20:59:44,784 - embedding storage: none
2023-10-25 20:59:44,784 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,784 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 20:59:44,784 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,784 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,784 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:59:45,775 epoch 1 - iter 13/136 - loss 2.85947296 - time (sec): 0.99 - samples/sec: 5097.94 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:59:46,798 epoch 1 - iter 26/136 - loss 2.26347229 - time (sec): 2.01 - samples/sec: 5059.75 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:59:47,774 epoch 1 - iter 39/136 - loss 1.70067118 - time (sec): 2.99 - samples/sec: 5065.52 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:59:48,854 epoch 1 - iter 52/136 - loss 1.34707525 - time (sec): 4.07 - samples/sec: 5189.88 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:59:49,781 epoch 1 - iter 65/136 - loss 1.18558706 - time (sec): 5.00 - samples/sec: 5134.55 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:59:50,747 epoch 1 - iter 78/136 - loss 1.05425464 - time (sec): 5.96 - samples/sec: 5054.21 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:59:51,963 epoch 1 - iter 91/136 - loss 0.91480738 - time (sec): 7.18 - samples/sec: 5050.33 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:59:52,946 epoch 1 - iter 104/136 - loss 0.83931390 - time (sec): 8.16 - samples/sec: 5003.54 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:59:53,951 epoch 1 - iter 117/136 - loss 0.77070604 - time (sec): 9.17 - samples/sec: 4961.74 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:59:54,977 epoch 1 - iter 130/136 - loss 0.72306304 - time (sec): 10.19 - samples/sec: 4933.66 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:59:55,375 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:55,376 EPOCH 1 done: loss 0.7034 - lr: 0.000047
2023-10-25 20:59:56,927 DEV : loss 0.15223081409931183 - f1-score (micro avg) 0.6703
2023-10-25 20:59:56,935 saving best model
2023-10-25 20:59:57,569 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:58,582 epoch 2 - iter 13/136 - loss 0.16703540 - time (sec): 1.01 - samples/sec: 4671.90 - lr: 0.000050 - momentum: 0.000000
2023-10-25 20:59:59,547 epoch 2 - iter 26/136 - loss 0.15937506 - time (sec): 1.98 - samples/sec: 4707.65 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:00:00,594 epoch 2 - iter 39/136 - loss 0.15134578 - time (sec): 3.02 - samples/sec: 4812.24 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:00:01,675 epoch 2 - iter 52/136 - loss 0.14389325 - time (sec): 4.10 - samples/sec: 4706.84 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:00:02,708 epoch 2 - iter 65/136 - loss 0.14979586 - time (sec): 5.14 - samples/sec: 4760.65 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:00:03,789 epoch 2 - iter 78/136 - loss 0.13989186 - time (sec): 6.22 - samples/sec: 4835.41 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:00:04,794 epoch 2 - iter 91/136 - loss 0.14397482 - time (sec): 7.22 - samples/sec: 4808.49 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:00:05,746 epoch 2 - iter 104/136 - loss 0.14130762 - time (sec): 8.17 - samples/sec: 4834.70 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:00:06,868 epoch 2 - iter 117/136 - loss 0.13753071 - time (sec): 9.30 - samples/sec: 4842.68 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:00:07,855 epoch 2 - iter 130/136 - loss 0.13327994 - time (sec): 10.28 - samples/sec: 4878.03 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:00:08,266 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:08,267 EPOCH 2 done: loss 0.1319 - lr: 0.000045
2023-10-25 21:00:09,575 DEV : loss 0.1049661636352539 - f1-score (micro avg) 0.7519
2023-10-25 21:00:09,581 saving best model
2023-10-25 21:00:10,357 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:11,351 epoch 3 - iter 13/136 - loss 0.06158613 - time (sec): 0.99 - samples/sec: 5276.59 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:00:12,274 epoch 3 - iter 26/136 - loss 0.06262839 - time (sec): 1.92 - samples/sec: 5255.14 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:00:13,281 epoch 3 - iter 39/136 - loss 0.06000293 - time (sec): 2.92 - samples/sec: 5250.88 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:00:14,303 epoch 3 - iter 52/136 - loss 0.06495053 - time (sec): 3.94 - samples/sec: 5114.57 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:00:15,213 epoch 3 - iter 65/136 - loss 0.06661245 - time (sec): 4.85 - samples/sec: 5054.42 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:00:16,327 epoch 3 - iter 78/136 - loss 0.06974909 - time (sec): 5.97 - samples/sec: 4928.34 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:00:17,283 epoch 3 - iter 91/136 - loss 0.06900942 - time (sec): 6.92 - samples/sec: 5001.33 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:00:18,327 epoch 3 - iter 104/136 - loss 0.06816376 - time (sec): 7.97 - samples/sec: 4941.16 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:00:19,396 epoch 3 - iter 117/136 - loss 0.06920871 - time (sec): 9.04 - samples/sec: 4931.17 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:00:20,371 epoch 3 - iter 130/136 - loss 0.06823123 - time (sec): 10.01 - samples/sec: 4946.54 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:00:20,863 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:20,863 EPOCH 3 done: loss 0.0670 - lr: 0.000039
2023-10-25 21:00:22,080 DEV : loss 0.10162875056266785 - f1-score (micro avg) 0.7889
2023-10-25 21:00:22,089 saving best model
2023-10-25 21:00:22,834 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:24,300 epoch 4 - iter 13/136 - loss 0.04730517 - time (sec): 1.46 - samples/sec: 3970.30 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:00:25,192 epoch 4 - iter 26/136 - loss 0.04042726 - time (sec): 2.35 - samples/sec: 4381.47 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:00:26,282 epoch 4 - iter 39/136 - loss 0.03644611 - time (sec): 3.44 - samples/sec: 4303.84 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:00:27,369 epoch 4 - iter 52/136 - loss 0.03410621 - time (sec): 4.53 - samples/sec: 4514.71 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:00:28,493 epoch 4 - iter 65/136 - loss 0.04025331 - time (sec): 5.65 - samples/sec: 4490.11 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:00:29,464 epoch 4 - iter 78/136 - loss 0.04364137 - time (sec): 6.63 - samples/sec: 4521.96 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:00:30,468 epoch 4 - iter 91/136 - loss 0.04264546 - time (sec): 7.63 - samples/sec: 4533.25 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:00:31,368 epoch 4 - iter 104/136 - loss 0.04203329 - time (sec): 8.53 - samples/sec: 4600.66 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:00:32,340 epoch 4 - iter 117/136 - loss 0.04133705 - time (sec): 9.50 - samples/sec: 4692.10 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:00:33,290 epoch 4 - iter 130/136 - loss 0.04059355 - time (sec): 10.45 - samples/sec: 4737.24 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:00:33,721 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:33,721 EPOCH 4 done: loss 0.0405 - lr: 0.000034
2023-10-25 21:00:34,927 DEV : loss 0.10846184939146042 - f1-score (micro avg) 0.8308
2023-10-25 21:00:34,936 saving best model
2023-10-25 21:00:35,682 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:36,629 epoch 5 - iter 13/136 - loss 0.03141050 - time (sec): 0.94 - samples/sec: 4559.48 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:00:37,751 epoch 5 - iter 26/136 - loss 0.02872201 - time (sec): 2.07 - samples/sec: 4918.32 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:00:38,592 epoch 5 - iter 39/136 - loss 0.03507623 - time (sec): 2.91 - samples/sec: 4748.68 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:00:39,604 epoch 5 - iter 52/136 - loss 0.03164434 - time (sec): 3.92 - samples/sec: 4779.15 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:00:40,516 epoch 5 - iter 65/136 - loss 0.02918988 - time (sec): 4.83 - samples/sec: 4808.93 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:00:41,612 epoch 5 - iter 78/136 - loss 0.03150872 - time (sec): 5.93 - samples/sec: 4825.89 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:00:42,606 epoch 5 - iter 91/136 - loss 0.02890380 - time (sec): 6.92 - samples/sec: 4803.11 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:00:43,731 epoch 5 - iter 104/136 - loss 0.02786063 - time (sec): 8.05 - samples/sec: 4789.05 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:00:44,673 epoch 5 - iter 117/136 - loss 0.02639590 - time (sec): 8.99 - samples/sec: 4800.15 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:00:45,741 epoch 5 - iter 130/136 - loss 0.02732309 - time (sec): 10.06 - samples/sec: 4884.75 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:00:46,285 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:46,285 EPOCH 5 done: loss 0.0269 - lr: 0.000028
2023-10-25 21:00:47,478 DEV : loss 0.12008755654096603 - f1-score (micro avg) 0.8229
2023-10-25 21:00:47,486 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:48,513 epoch 6 - iter 13/136 - loss 0.01660326 - time (sec): 1.02 - samples/sec: 5215.58 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:00:49,506 epoch 6 - iter 26/136 - loss 0.01853080 - time (sec): 2.02 - samples/sec: 4998.73 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:00:51,193 epoch 6 - iter 39/136 - loss 0.02629362 - time (sec): 3.71 - samples/sec: 4163.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:00:52,218 epoch 6 - iter 52/136 - loss 0.02289507 - time (sec): 4.73 - samples/sec: 4335.97 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:00:53,321 epoch 6 - iter 65/136 - loss 0.02171427 - time (sec): 5.83 - samples/sec: 4421.36 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:00:54,469 epoch 6 - iter 78/136 - loss 0.02066805 - time (sec): 6.98 - samples/sec: 4438.53 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:00:55,500 epoch 6 - iter 91/136 - loss 0.01979943 - time (sec): 8.01 - samples/sec: 4490.87 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:00:56,454 epoch 6 - iter 104/136 - loss 0.01868719 - time (sec): 8.97 - samples/sec: 4542.88 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:00:57,486 epoch 6 - iter 117/136 - loss 0.01750765 - time (sec): 10.00 - samples/sec: 4564.53 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:00:58,488 epoch 6 - iter 130/136 - loss 0.01957900 - time (sec): 11.00 - samples/sec: 4564.31 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:00:58,902 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:58,902 EPOCH 6 done: loss 0.0193 - lr: 0.000023
2023-10-25 21:01:00,093 DEV : loss 0.1586846262216568 - f1-score (micro avg) 0.8037
2023-10-25 21:01:00,100 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:01,149 epoch 7 - iter 13/136 - loss 0.01331116 - time (sec): 1.05 - samples/sec: 4710.62 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:01:02,159 epoch 7 - iter 26/136 - loss 0.01370313 - time (sec): 2.06 - samples/sec: 4703.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:01:03,239 epoch 7 - iter 39/136 - loss 0.01517733 - time (sec): 3.14 - samples/sec: 4943.03 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:01:04,235 epoch 7 - iter 52/136 - loss 0.01384388 - time (sec): 4.13 - samples/sec: 5033.02 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:01:05,261 epoch 7 - iter 65/136 - loss 0.01349590 - time (sec): 5.16 - samples/sec: 5124.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:01:06,278 epoch 7 - iter 78/136 - loss 0.01274396 - time (sec): 6.18 - samples/sec: 5199.20 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:01:07,152 epoch 7 - iter 91/136 - loss 0.01291160 - time (sec): 7.05 - samples/sec: 5120.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:01:08,303 epoch 7 - iter 104/136 - loss 0.01227420 - time (sec): 8.20 - samples/sec: 5082.76 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:01:09,258 epoch 7 - iter 117/136 - loss 0.01318525 - time (sec): 9.16 - samples/sec: 5063.74 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:01:10,116 epoch 7 - iter 130/136 - loss 0.01416944 - time (sec): 10.01 - samples/sec: 4972.56 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:01:10,563 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:10,563 EPOCH 7 done: loss 0.0141 - lr: 0.000017
2023-10-25 21:01:11,820 DEV : loss 0.16495871543884277 - f1-score (micro avg) 0.8148
2023-10-25 21:01:11,826 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:12,857 epoch 8 - iter 13/136 - loss 0.01122739 - time (sec): 1.03 - samples/sec: 5265.27 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:01:14,332 epoch 8 - iter 26/136 - loss 0.00858612 - time (sec): 2.50 - samples/sec: 4312.20 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:01:15,304 epoch 8 - iter 39/136 - loss 0.00691838 - time (sec): 3.48 - samples/sec: 4508.95 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:01:16,236 epoch 8 - iter 52/136 - loss 0.00755005 - time (sec): 4.41 - samples/sec: 4682.13 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:01:17,275 epoch 8 - iter 65/136 - loss 0.00851409 - time (sec): 5.45 - samples/sec: 4800.93 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:01:18,133 epoch 8 - iter 78/136 - loss 0.01000584 - time (sec): 6.31 - samples/sec: 4757.37 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:01:19,143 epoch 8 - iter 91/136 - loss 0.00949333 - time (sec): 7.32 - samples/sec: 4784.99 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:01:20,118 epoch 8 - iter 104/136 - loss 0.00987032 - time (sec): 8.29 - samples/sec: 4794.40 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:01:21,124 epoch 8 - iter 117/136 - loss 0.01035730 - time (sec): 9.30 - samples/sec: 4836.36 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:01:22,127 epoch 8 - iter 130/136 - loss 0.00957887 - time (sec): 10.30 - samples/sec: 4838.75 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:01:22,545 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:22,546 EPOCH 8 done: loss 0.0102 - lr: 0.000012
2023-10-25 21:01:23,746 DEV : loss 0.16520875692367554 - f1-score (micro avg) 0.837
2023-10-25 21:01:23,753 saving best model
2023-10-25 21:01:24,481 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:25,536 epoch 9 - iter 13/136 - loss 0.00127382 - time (sec): 1.05 - samples/sec: 5511.06 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:01:26,507 epoch 9 - iter 26/136 - loss 0.00294119 - time (sec): 2.02 - samples/sec: 4875.77 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:01:27,476 epoch 9 - iter 39/136 - loss 0.00448983 - time (sec): 2.99 - samples/sec: 5024.62 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:01:28,555 epoch 9 - iter 52/136 - loss 0.00746805 - time (sec): 4.07 - samples/sec: 5126.83 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:01:29,495 epoch 9 - iter 65/136 - loss 0.00753869 - time (sec): 5.01 - samples/sec: 5092.24 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:01:30,570 epoch 9 - iter 78/136 - loss 0.00640345 - time (sec): 6.09 - samples/sec: 5044.18 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:01:31,486 epoch 9 - iter 91/136 - loss 0.00658349 - time (sec): 7.00 - samples/sec: 4964.34 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:01:32,359 epoch 9 - iter 104/136 - loss 0.00725150 - time (sec): 7.88 - samples/sec: 4989.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:01:33,337 epoch 9 - iter 117/136 - loss 0.00741016 - time (sec): 8.85 - samples/sec: 5028.89 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:01:34,317 epoch 9 - iter 130/136 - loss 0.00699976 - time (sec): 9.83 - samples/sec: 5058.75 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:01:34,750 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:34,750 EPOCH 9 done: loss 0.0071 - lr: 0.000006
2023-10-25 21:01:35,934 DEV : loss 0.1780930459499359 - f1-score (micro avg) 0.8272
2023-10-25 21:01:35,941 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:37,319 epoch 10 - iter 13/136 - loss 0.01811245 - time (sec): 1.38 - samples/sec: 3085.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:01:38,256 epoch 10 - iter 26/136 - loss 0.00912043 - time (sec): 2.31 - samples/sec: 3837.60 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:01:39,208 epoch 10 - iter 39/136 - loss 0.00728411 - time (sec): 3.26 - samples/sec: 4173.53 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:01:40,251 epoch 10 - iter 52/136 - loss 0.00678921 - time (sec): 4.31 - samples/sec: 4258.00 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:01:41,434 epoch 10 - iter 65/136 - loss 0.00635061 - time (sec): 5.49 - samples/sec: 4445.58 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:01:42,447 epoch 10 - iter 78/136 - loss 0.00610468 - time (sec): 6.50 - samples/sec: 4563.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:01:43,338 epoch 10 - iter 91/136 - loss 0.00568238 - time (sec): 7.39 - samples/sec: 4581.05 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:01:44,373 epoch 10 - iter 104/136 - loss 0.00490935 - time (sec): 8.43 - samples/sec: 4696.12 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:01:45,286 epoch 10 - iter 117/136 - loss 0.00489042 - time (sec): 9.34 - samples/sec: 4728.55 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:01:46,404 epoch 10 - iter 130/136 - loss 0.00561185 - time (sec): 10.46 - samples/sec: 4765.39 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:01:46,871 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:46,872 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-25 21:01:48,026 DEV : loss 0.17910274863243103 - f1-score (micro avg) 0.8407
2023-10-25 21:01:48,032 saving best model
2023-10-25 21:01:49,290 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:49,292 Loading model from best epoch ...
2023-10-25 21:01:51,250 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
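The 17-tag dictionary is exactly what a BIOES tagging scheme predicts for this corpus's four entity types: one O tag plus S-/B-/E-/I- variants of each type, which also matches the out_features=17 of the tagger's final linear layer above. A quick sanity check (plain Python):

```python
# BIOES tag inventory for the four entity types seen in this corpus.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
print(len(tags))  # 17 = 1 + 4 types x 4 prefixes
```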
2023-10-25 21:01:53,287
Results:
- F-score (micro) 0.7833
- F-score (macro) 0.7357
- Accuracy 0.6618

By class:
              precision    recall  f1-score   support

         LOC     0.8060    0.8654    0.8346       312
         PER     0.7047    0.8606    0.7749       208
         ORG     0.4912    0.5091    0.5000        55
   HumanProd     0.7692    0.9091    0.8333        22

   micro avg     0.7396    0.8325    0.7833       597
   macro avg     0.6928    0.7860    0.7357       597
weighted avg     0.7403    0.8325    0.7829       597

2023-10-25 21:01:53,288 ----------------------------------------------------------------------------------------------------
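The aggregate rows of the table can be reproduced from the per-class rows: the macro average is the unweighted mean over classes, the weighted average is the support-weighted mean, and micro F1 is the harmonic mean of the pooled (micro) precision and recall. A quick check (plain Python, using the figures printed above):

```python
# (precision, recall, f1, support) per class, copied from the table above.
by_class = {
    "LOC":       (0.8060, 0.8654, 0.8346, 312),
    "PER":       (0.7047, 0.8606, 0.7749, 208),
    "ORG":       (0.4912, 0.5091, 0.5000,  55),
    "HumanProd": (0.7692, 0.9091, 0.8333,  22),
}

n = sum(v[3] for v in by_class.values())  # total support
macro_f1 = sum(v[2] for v in by_class.values()) / len(by_class)
weighted_f1 = sum(v[2] * v[3] for v in by_class.values()) / n

# Micro averages come from pooled counts; given the printed micro
# precision and recall, micro F1 is their harmonic mean.
p, r = 0.7396, 0.8325
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

All three recomputed values agree with the table (0.7357, 0.7829, 0.7833) to the printed precision.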