|
2023-10-20 00:22:24,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,768 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(32001, 128) |
|
        (position_embeddings): Embedding(512, 128) |
|
        (token_type_embeddings): Embedding(2, 128) |
|
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-1): 2 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=128, out_features=128, bias=True) |
|
                (key): Linear(in_features=128, out_features=128, bias=True) |
|
                (value): Linear(in_features=128, out_features=128, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=128, out_features=128, bias=True) |
|
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=128, out_features=512, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=512, out_features=128, bias=True) |
|
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=128, out_features=128, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=128, out_features=17, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-20 00:22:24,768 ---------------------------------------------------------------------------------------------------- |
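The layer shapes printed in the model summary are enough to estimate the size of this tagger. The following is a rough sketch, assuming every `Linear` carries a bias (as printed), every `LayerNorm` has a scale and a shift vector, and there are no trainable parameters beyond the modules listed in the summary. The helper names are made up for this illustration.

```python
def linear_params(n_in, n_out):
    """Weights plus bias of Linear(in_features=n_in, out_features=n_out, bias=True)."""
    return n_in * n_out + n_out


def layer_norm_params(dim):
    """Scale and shift vectors of an elementwise-affine LayerNorm."""
    return 2 * dim


# Sizes read off the model summary above.
hidden, ffn = 128, 512
vocab, positions, segments = 32001, 512, 2
n_layers, n_tags = 2, 17

# Word, position and token-type embeddings plus their LayerNorm.
embeddings = (vocab + positions + segments) * hidden + layer_norm_params(hidden)

# Query, key, value and attention-output projections plus LayerNorm.
attention = 4 * linear_params(hidden, hidden) + layer_norm_params(hidden)

# Intermediate and output dense layers plus LayerNorm.
feed_forward = (
    linear_params(hidden, ffn) + linear_params(ffn, hidden) + layer_norm_params(hidden)
)

encoder = n_layers * (attention + feed_forward)
pooler = linear_params(hidden, hidden)
tagger_head = linear_params(hidden, n_tags)

bert_total = embeddings + encoder + pooler
total = bert_total + tagger_head

print(f"embeddings:  {embeddings:,}")
print(f"encoder:     {encoder:,}")
print(f"pooler:      {pooler:,}")
print(f"tagger head: {tagger_head:,}")
print(f"total:       {total:,}")
```

Under these assumptions the word embedding matrix accounts for most of the parameters, while the two encoder layers and the 17-way tagging head are comparatively small.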
|
2023-10-20 00:22:24,768 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 Train: 1085 sentences |
|
2023-10-20 00:22:24,769 (train_with_dev=False, train_with_test=False) |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 Training Params: |
|
2023-10-20 00:22:24,769 - learning_rate: "3e-05" |
|
2023-10-20 00:22:24,769 - mini_batch_size: "8" |
|
2023-10-20 00:22:24,769 - max_epochs: "10" |
|
2023-10-20 00:22:24,769 - shuffle: "True" |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 Plugins: |
|
2023-10-20 00:22:24,769 - TensorboardLogger |
|
2023-10-20 00:22:24,769 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
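The `lr` column in the iteration lines below rises during the first epoch and then falls towards zero, which is consistent with a linear warmup followed by a linear decay. The sketch below is a generic version of such a schedule, not the trainer's own code; the function name and the exact step indexing are assumptions, while the peak rate, batch size, epoch count and `warmup_fraction` are taken from the parameters logged above.

```python
import math


def linear_warmup_decay(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Generic linear warmup to peak_lr, then linear decay to zero (illustrative)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# 1085 training sentences with mini_batch_size 8 for 10 epochs.
batches_per_epoch = math.ceil(1085 / 8)
total_steps = batches_per_epoch * 10

for step in (0, 68, 136, 680, 1360):
    print(step, f"{linear_warmup_decay(step, total_steps):.6f}")
```

With these settings the warmup covers the first 136 updates, that is, exactly the first epoch. The logged values may differ from this sketch by one step, depending on when the trainer reads the rate.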
|
2023-10-20 00:22:24,769 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-20 00:22:24,769 - metric: "('micro avg', 'f1-score')" |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 Computation: |
|
2023-10-20 00:22:24,769 - compute on device: cuda:0 |
|
2023-10-20 00:22:24,769 - embedding storage: none |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:24,769 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-20 00:22:25,129 epoch 1 - iter 13/136 - loss 3.44351759 - time (sec): 0.36 - samples/sec: 13857.83 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:22:25,490 epoch 1 - iter 26/136 - loss 3.46655674 - time (sec): 0.72 - samples/sec: 13842.40 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:22:25,837 epoch 1 - iter 39/136 - loss 3.43990278 - time (sec): 1.07 - samples/sec: 13769.94 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:22:26,213 epoch 1 - iter 52/136 - loss 3.38258302 - time (sec): 1.44 - samples/sec: 13992.74 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:22:26,566 epoch 1 - iter 65/136 - loss 3.30607240 - time (sec): 1.80 - samples/sec: 13965.12 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:22:26,919 epoch 1 - iter 78/136 - loss 3.19482226 - time (sec): 2.15 - samples/sec: 13967.02 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-20 00:22:27,285 epoch 1 - iter 91/136 - loss 3.09110143 - time (sec): 2.51 - samples/sec: 13585.65 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-20 00:22:27,651 epoch 1 - iter 104/136 - loss 2.92483015 - time (sec): 2.88 - samples/sec: 13860.57 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-20 00:22:28,008 epoch 1 - iter 117/136 - loss 2.76046108 - time (sec): 3.24 - samples/sec: 14163.43 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-20 00:22:28,348 epoch 1 - iter 130/136 - loss 2.63449514 - time (sec): 3.58 - samples/sec: 14087.01 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-20 00:22:28,498 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:28,498 EPOCH 1 done: loss 2.5888 - lr: 0.000028 |
|
2023-10-20 00:22:28,937 DEV : loss 0.6413638591766357 - f1-score (micro avg) 0.0 |
|
2023-10-20 00:22:28,941 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:29,312 epoch 2 - iter 13/136 - loss 0.98905430 - time (sec): 0.37 - samples/sec: 13635.64 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-20 00:22:29,645 epoch 2 - iter 26/136 - loss 0.88699646 - time (sec): 0.70 - samples/sec: 13806.84 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-20 00:22:29,988 epoch 2 - iter 39/136 - loss 0.81728117 - time (sec): 1.05 - samples/sec: 14204.64 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-20 00:22:30,333 epoch 2 - iter 52/136 - loss 0.77470536 - time (sec): 1.39 - samples/sec: 14322.33 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-20 00:22:30,677 epoch 2 - iter 65/136 - loss 0.76159135 - time (sec): 1.74 - samples/sec: 14557.88 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-20 00:22:31,027 epoch 2 - iter 78/136 - loss 0.72820103 - time (sec): 2.09 - samples/sec: 14276.83 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-20 00:22:31,400 epoch 2 - iter 91/136 - loss 0.71195032 - time (sec): 2.46 - samples/sec: 14309.40 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-20 00:22:31,745 epoch 2 - iter 104/136 - loss 0.70435205 - time (sec): 2.80 - samples/sec: 14160.93 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-20 00:22:32,097 epoch 2 - iter 117/136 - loss 0.70428881 - time (sec): 3.16 - samples/sec: 14341.73 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-20 00:22:32,446 epoch 2 - iter 130/136 - loss 0.69773696 - time (sec): 3.50 - samples/sec: 14192.01 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-20 00:22:32,606 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:32,606 EPOCH 2 done: loss 0.7011 - lr: 0.000027 |
|
2023-10-20 00:22:33,357 DEV : loss 0.4485773742198944 - f1-score (micro avg) 0.0 |
|
2023-10-20 00:22:33,360 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:33,721 epoch 3 - iter 13/136 - loss 0.55105478 - time (sec): 0.36 - samples/sec: 13870.94 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-20 00:22:34,055 epoch 3 - iter 26/136 - loss 0.58773544 - time (sec): 0.69 - samples/sec: 14285.98 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-20 00:22:34,414 epoch 3 - iter 39/136 - loss 0.56019690 - time (sec): 1.05 - samples/sec: 14441.41 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-20 00:22:34,759 epoch 3 - iter 52/136 - loss 0.54452968 - time (sec): 1.40 - samples/sec: 14420.73 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-20 00:22:35,112 epoch 3 - iter 65/136 - loss 0.54919669 - time (sec): 1.75 - samples/sec: 14716.65 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-20 00:22:35,482 epoch 3 - iter 78/136 - loss 0.54099639 - time (sec): 2.12 - samples/sec: 14437.67 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-20 00:22:35,829 epoch 3 - iter 91/136 - loss 0.54011844 - time (sec): 2.47 - samples/sec: 14245.04 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-20 00:22:36,186 epoch 3 - iter 104/136 - loss 0.54590190 - time (sec): 2.82 - samples/sec: 14338.01 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-20 00:22:36,551 epoch 3 - iter 117/136 - loss 0.55210968 - time (sec): 3.19 - samples/sec: 14134.85 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-20 00:22:36,893 epoch 3 - iter 130/136 - loss 0.55017447 - time (sec): 3.53 - samples/sec: 13969.71 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-20 00:22:37,063 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:37,064 EPOCH 3 done: loss 0.5547 - lr: 0.000024 |
|
2023-10-20 00:22:37,806 DEV : loss 0.4059444069862366 - f1-score (micro avg) 0.0 |
|
2023-10-20 00:22:37,810 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:38,182 epoch 4 - iter 13/136 - loss 0.51549779 - time (sec): 0.37 - samples/sec: 11724.55 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-20 00:22:38,505 epoch 4 - iter 26/136 - loss 0.51869795 - time (sec): 0.70 - samples/sec: 11612.06 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-20 00:22:38,863 epoch 4 - iter 39/136 - loss 0.51126903 - time (sec): 1.05 - samples/sec: 12725.40 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-20 00:22:39,224 epoch 4 - iter 52/136 - loss 0.49417554 - time (sec): 1.41 - samples/sec: 13332.33 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-20 00:22:39,558 epoch 4 - iter 65/136 - loss 0.49957532 - time (sec): 1.75 - samples/sec: 13210.93 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-20 00:22:39,937 epoch 4 - iter 78/136 - loss 0.50175770 - time (sec): 2.13 - samples/sec: 13904.79 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-20 00:22:40,316 epoch 4 - iter 91/136 - loss 0.50681614 - time (sec): 2.51 - samples/sec: 13967.11 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-20 00:22:40,827 epoch 4 - iter 104/136 - loss 0.51289799 - time (sec): 3.02 - samples/sec: 13393.98 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-20 00:22:41,175 epoch 4 - iter 117/136 - loss 0.52005990 - time (sec): 3.36 - samples/sec: 13435.60 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-20 00:22:41,533 epoch 4 - iter 130/136 - loss 0.51201849 - time (sec): 3.72 - samples/sec: 13348.19 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-20 00:22:41,705 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:41,705 EPOCH 4 done: loss 0.5105 - lr: 0.000020 |
|
2023-10-20 00:22:42,457 DEV : loss 0.37089869379997253 - f1-score (micro avg) 0.0142 |
|
2023-10-20 00:22:42,461 saving best model |
|
2023-10-20 00:22:42,488 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:42,863 epoch 5 - iter 13/136 - loss 0.42995486 - time (sec): 0.37 - samples/sec: 14289.13 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-20 00:22:43,200 epoch 5 - iter 26/136 - loss 0.47224038 - time (sec): 0.71 - samples/sec: 13857.96 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-20 00:22:43,561 epoch 5 - iter 39/136 - loss 0.46901777 - time (sec): 1.07 - samples/sec: 14143.76 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-20 00:22:43,934 epoch 5 - iter 52/136 - loss 0.46261360 - time (sec): 1.45 - samples/sec: 13805.11 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-20 00:22:44,261 epoch 5 - iter 65/136 - loss 0.46958411 - time (sec): 1.77 - samples/sec: 13687.77 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-20 00:22:44,629 epoch 5 - iter 78/136 - loss 0.46513553 - time (sec): 2.14 - samples/sec: 13564.04 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-20 00:22:44,983 epoch 5 - iter 91/136 - loss 0.47023229 - time (sec): 2.49 - samples/sec: 13623.38 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-20 00:22:45,331 epoch 5 - iter 104/136 - loss 0.45817745 - time (sec): 2.84 - samples/sec: 13584.07 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-20 00:22:45,685 epoch 5 - iter 117/136 - loss 0.45840126 - time (sec): 3.20 - samples/sec: 13745.81 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-20 00:22:46,057 epoch 5 - iter 130/136 - loss 0.45303335 - time (sec): 3.57 - samples/sec: 13983.13 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-20 00:22:46,213 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:46,214 EPOCH 5 done: loss 0.4545 - lr: 0.000017 |
|
2023-10-20 00:22:46,965 DEV : loss 0.3393750786781311 - f1-score (micro avg) 0.0598 |
|
2023-10-20 00:22:46,968 saving best model |
|
2023-10-20 00:22:47,004 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:47,346 epoch 6 - iter 13/136 - loss 0.44677072 - time (sec): 0.34 - samples/sec: 14141.18 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-20 00:22:47,678 epoch 6 - iter 26/136 - loss 0.47010642 - time (sec): 0.67 - samples/sec: 14319.92 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-20 00:22:48,026 epoch 6 - iter 39/136 - loss 0.46284259 - time (sec): 1.02 - samples/sec: 13984.99 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-20 00:22:48,385 epoch 6 - iter 52/136 - loss 0.43787820 - time (sec): 1.38 - samples/sec: 14168.64 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:22:48,762 epoch 6 - iter 65/136 - loss 0.43824809 - time (sec): 1.76 - samples/sec: 14104.75 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:22:49,114 epoch 6 - iter 78/136 - loss 0.43257744 - time (sec): 2.11 - samples/sec: 13902.04 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:22:49,479 epoch 6 - iter 91/136 - loss 0.44397018 - time (sec): 2.47 - samples/sec: 14385.82 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:22:49,840 epoch 6 - iter 104/136 - loss 0.44969610 - time (sec): 2.84 - samples/sec: 14272.23 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:22:50,187 epoch 6 - iter 117/136 - loss 0.44285295 - time (sec): 3.18 - samples/sec: 14359.24 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:22:50,521 epoch 6 - iter 130/136 - loss 0.44187721 - time (sec): 3.52 - samples/sec: 14101.80 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:22:50,685 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:50,686 EPOCH 6 done: loss 0.4416 - lr: 0.000014 |
|
2023-10-20 00:22:51,441 DEV : loss 0.3326244056224823 - f1-score (micro avg) 0.0772 |
|
2023-10-20 00:22:51,445 saving best model |
|
2023-10-20 00:22:51,476 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:51,862 epoch 7 - iter 13/136 - loss 0.36442837 - time (sec): 0.38 - samples/sec: 13852.34 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-20 00:22:52,221 epoch 7 - iter 26/136 - loss 0.38859612 - time (sec): 0.74 - samples/sec: 14113.87 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-20 00:22:52,592 epoch 7 - iter 39/136 - loss 0.39938690 - time (sec): 1.12 - samples/sec: 13814.36 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:22:53,118 epoch 7 - iter 52/136 - loss 0.43551536 - time (sec): 1.64 - samples/sec: 12014.72 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:22:53,482 epoch 7 - iter 65/136 - loss 0.41820471 - time (sec): 2.00 - samples/sec: 12279.26 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:22:53,821 epoch 7 - iter 78/136 - loss 0.40840389 - time (sec): 2.34 - samples/sec: 12608.63 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:22:54,187 epoch 7 - iter 91/136 - loss 0.40204516 - time (sec): 2.71 - samples/sec: 12939.68 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:22:54,513 epoch 7 - iter 104/136 - loss 0.41707340 - time (sec): 3.04 - samples/sec: 12794.91 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:22:54,869 epoch 7 - iter 117/136 - loss 0.42204175 - time (sec): 3.39 - samples/sec: 13128.70 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:22:55,219 epoch 7 - iter 130/136 - loss 0.42592096 - time (sec): 3.74 - samples/sec: 13271.04 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-20 00:22:55,380 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:55,380 EPOCH 7 done: loss 0.4260 - lr: 0.000010 |
|
2023-10-20 00:22:56,150 DEV : loss 0.3223443627357483 - f1-score (micro avg) 0.0807 |
|
2023-10-20 00:22:56,153 saving best model |
|
2023-10-20 00:22:56,183 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:56,542 epoch 8 - iter 13/136 - loss 0.45169781 - time (sec): 0.36 - samples/sec: 13586.26 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-20 00:22:56,893 epoch 8 - iter 26/136 - loss 0.41974333 - time (sec): 0.71 - samples/sec: 13941.90 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:22:57,272 epoch 8 - iter 39/136 - loss 0.41289236 - time (sec): 1.09 - samples/sec: 13931.54 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:22:57,630 epoch 8 - iter 52/136 - loss 0.39780873 - time (sec): 1.45 - samples/sec: 13880.26 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:22:57,996 epoch 8 - iter 65/136 - loss 0.41163537 - time (sec): 1.81 - samples/sec: 13923.22 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:22:58,345 epoch 8 - iter 78/136 - loss 0.41401902 - time (sec): 2.16 - samples/sec: 13867.44 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:22:58,701 epoch 8 - iter 91/136 - loss 0.40917998 - time (sec): 2.52 - samples/sec: 13962.43 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:22:59,008 epoch 8 - iter 104/136 - loss 0.40545870 - time (sec): 2.82 - samples/sec: 14503.46 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:22:59,318 epoch 8 - iter 117/136 - loss 0.40534738 - time (sec): 3.13 - samples/sec: 14541.73 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-20 00:22:59,622 epoch 8 - iter 130/136 - loss 0.41156594 - time (sec): 3.44 - samples/sec: 14650.13 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-20 00:22:59,773 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:22:59,773 EPOCH 8 done: loss 0.4126 - lr: 0.000007 |
|
2023-10-20 00:23:00,531 DEV : loss 0.31802523136138916 - f1-score (micro avg) 0.1015 |
|
2023-10-20 00:23:00,535 saving best model |
|
2023-10-20 00:23:00,566 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:23:00,926 epoch 9 - iter 13/136 - loss 0.40999310 - time (sec): 0.36 - samples/sec: 14241.92 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:23:01,283 epoch 9 - iter 26/136 - loss 0.40070447 - time (sec): 0.72 - samples/sec: 14007.90 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:23:01,637 epoch 9 - iter 39/136 - loss 0.43119169 - time (sec): 1.07 - samples/sec: 13630.77 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:23:01,990 epoch 9 - iter 52/136 - loss 0.43179866 - time (sec): 1.42 - samples/sec: 13018.77 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:23:02,350 epoch 9 - iter 65/136 - loss 0.42187229 - time (sec): 1.78 - samples/sec: 14042.08 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:23:02,719 epoch 9 - iter 78/136 - loss 0.41991045 - time (sec): 2.15 - samples/sec: 14250.53 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:23:03,057 epoch 9 - iter 91/136 - loss 0.41320907 - time (sec): 2.49 - samples/sec: 14079.07 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:23:03,402 epoch 9 - iter 104/136 - loss 0.41152836 - time (sec): 2.84 - samples/sec: 14022.38 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:23:03,747 epoch 9 - iter 117/136 - loss 0.40616924 - time (sec): 3.18 - samples/sec: 14097.87 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:23:04,116 epoch 9 - iter 130/136 - loss 0.40568164 - time (sec): 3.55 - samples/sec: 14078.32 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:23:04,283 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:23:04,283 EPOCH 9 done: loss 0.4081 - lr: 0.000004 |
|
2023-10-20 00:23:05,049 DEV : loss 0.3174845576286316 - f1-score (micro avg) 0.119 |
|
2023-10-20 00:23:05,053 saving best model |
|
2023-10-20 00:23:05,084 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:23:05,448 epoch 10 - iter 13/136 - loss 0.44732210 - time (sec): 0.36 - samples/sec: 12951.68 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:23:05,984 epoch 10 - iter 26/136 - loss 0.43695884 - time (sec): 0.90 - samples/sec: 11357.10 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:23:06,320 epoch 10 - iter 39/136 - loss 0.41842028 - time (sec): 1.24 - samples/sec: 12027.01 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:23:06,657 epoch 10 - iter 52/136 - loss 0.41571030 - time (sec): 1.57 - samples/sec: 12230.62 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:23:06,997 epoch 10 - iter 65/136 - loss 0.42240301 - time (sec): 1.91 - samples/sec: 12665.19 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:23:07,336 epoch 10 - iter 78/136 - loss 0.41719114 - time (sec): 2.25 - samples/sec: 12738.21 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:23:07,691 epoch 10 - iter 91/136 - loss 0.41282796 - time (sec): 2.61 - samples/sec: 13124.03 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:23:08,053 epoch 10 - iter 104/136 - loss 0.41292833 - time (sec): 2.97 - samples/sec: 13337.59 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:23:08,405 epoch 10 - iter 117/136 - loss 0.40670187 - time (sec): 3.32 - samples/sec: 13577.45 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:23:08,750 epoch 10 - iter 130/136 - loss 0.40537168 - time (sec): 3.67 - samples/sec: 13612.30 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-20 00:23:08,917 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:23:08,917 EPOCH 10 done: loss 0.4046 - lr: 0.000000 |
|
2023-10-20 00:23:09,706 DEV : loss 0.3158849775791168 - f1-score (micro avg) 0.1294 |
|
2023-10-20 00:23:09,710 saving best model |
|
2023-10-20 00:23:09,765 ---------------------------------------------------------------------------------------------------- |
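To follow the development score across epochs without reading every line, the `DEV` lines can be pulled out with a regular expression. This is a small sketch that assumes the line format shown in this log; `parse_dev_scores` is a hypothetical helper, and the sample lines are copied from epochs 8 to 10 above.

```python
import re

# Matches e.g. "DEV : loss 0.3158849775791168 - f1-score (micro avg) 0.1294"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>\d+\.\d+) - f1-score \(micro avg\) (?P<f1>\d+(?:\.\d+)?)"
)


def parse_dev_scores(lines):
    """Return a list of (dev_loss, dev_micro_f1) tuples, one per DEV line."""
    scores = []
    for line in lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append((float(match["loss"]), float(match["f1"])))
    return scores


sample = [
    "2023-10-20 00:23:00,531 DEV : loss 0.31802523136138916 - f1-score (micro avg) 0.1015",
    "2023-10-20 00:23:00,535 saving best model",
    "2023-10-20 00:23:05,049 DEV : loss 0.3174845576286316 - f1-score (micro avg) 0.119",
    "2023-10-20 00:23:09,706 DEV : loss 0.3158849775791168 - f1-score (micro avg) 0.1294",
]

scores = parse_dev_scores(sample)
# Index of the entry with the highest dev micro F1 within the sample.
best_index = max(range(len(scores)), key=lambda i: scores[i][1])
print(scores[best_index])
```

In this run the development micro F1 improves at every epoch from epoch 4 onwards, so the checkpoint saved after epoch 10 is the one loaded for the final evaluation.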
|
2023-10-20 00:23:09,766 Loading model from best epoch ... |
|
2023-10-20 00:23:09,840 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
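The 17 tags listed above follow a BIOES-style layout: one `O` tag plus four position prefixes (`S`, `B`, `E`, `I`) for each of the four entity types. The sketch below rebuilds the list in the logged order to show where the `out_features=17` of the final linear layer comes from; the variable names are illustrative only.

```python
# Entity types and prefixes in the order they appear in the logged dictionary.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" marks tokens outside any entity; every type gets one tag per prefix.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```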
|
2023-10-20 00:23:10,631 |
|
Results: |
|
- F-score (micro) 0.1158 |
|
- F-score (macro) 0.0596 |
|
- Accuracy 0.0634 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         PER     0.1696    0.1875    0.1781       208 |
|
         LOC     0.5263    0.0321    0.0604       312 |
|
         ORG     0.0000    0.0000    0.0000        55 |
|
   HumanProd     0.0000    0.0000    0.0000        22 |
|

|
   micro avg     0.1968    0.0821    0.1158       597 |
|
   macro avg     0.1740    0.0549    0.0596       597 |
|
weighted avg     0.3341    0.0821    0.0936       597 |
|
|
|
2023-10-20 00:23:10,631 ---------------------------------------------------------------------------------------------------- |
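The three averaged rows of the report can be cross-checked from the per-class rows. The sketch below assumes the usual definitions: macro is the unweighted mean over classes, weighted is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. It works from the rounded values printed above, so results agree with the log only up to rounding.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "PER": (0.1696, 0.1875, 0.1781, 208),
    "LOC": (0.5263, 0.0321, 0.0604, 312),
    "ORG": (0.0000, 0.0000, 0.0000, 55),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}

total_support = sum(support for _, _, _, support in per_class.values())

# Macro average: every class counts equally, regardless of its size.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total_support

# Micro F1 from the reported micro precision and recall.
micro_precision, micro_recall = 0.1968, 0.0821
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support:     {total_support}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"micro F1:    {micro_f1:.4f}")
```

The gap between micro and macro F1 reflects that ORG and HumanProd, which together account for 77 of the 597 gold entities, are never predicted correctly.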
|
|