2023-10-19 23:57:29,912 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Train:  1166 sentences
2023-10-19 23:57:29,913         (train_with_dev=False, train_with_test=False)
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Training Params:
2023-10-19 23:57:29,913  - learning_rate: "3e-05"
2023-10-19 23:57:29,913  - mini_batch_size: "4"
2023-10-19 23:57:29,913  - max_epochs: "10"
2023-10-19 23:57:29,913  - shuffle: "True"
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Plugins:
2023-10-19 23:57:29,913  - TensorboardLogger
2023-10-19 23:57:29,913  - LinearScheduler | warmup_fraction: '0.1'
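The LinearScheduler plugin with warmup_fraction 0.1, together with the per-iteration lr values logged below (rising to 3e-05 over roughly the first 10% of the 2920 total steps, then falling to zero by the end of epoch 10), corresponds to a linear warmup/decay schedule. A minimal sketch in plain Python — the function name and step convention are illustrative, not Flair's internal API:

```python
def linear_schedule_lr(step, total_steps=2920, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0.

    total_steps = 292 iterations/epoch * 10 epochs, matching this run.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: lr grows linearly with the step count
        return peak_lr * step / warmup_steps
    # decay phase: lr shrinks linearly to 0 at total_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This reproduces the logged trajectory: about 0.000003 at iteration 29 of epoch 1, peaking near 0.000030 at the end of epoch 1 (step 292 = the warmup boundary), and about 0.000000 in the final iterations of epoch 10.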
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:57:29,913  - metric: "('micro avg', 'f1-score')"
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Computation:
2023-10-19 23:57:29,914  - compute on device: cuda:0
2023-10-19 23:57:29,914  - embedding storage: none
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 23:57:30,355 epoch 1 - iter 29/292 - loss 3.15858994 - time (sec): 0.44 - samples/sec: 8983.13 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:57:30,831 epoch 1 - iter 58/292 - loss 3.12098443 - time (sec): 0.92 - samples/sec: 8389.04 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:57:31,365 epoch 1 - iter 87/292 - loss 3.09355439 - time (sec): 1.45 - samples/sec: 8367.32 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:57:31,877 epoch 1 - iter 116/292 - loss 2.97640957 - time (sec): 1.96 - samples/sec: 8214.22 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:57:32,386 epoch 1 - iter 145/292 - loss 2.82557112 - time (sec): 2.47 - samples/sec: 8349.50 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:57:32,887 epoch 1 - iter 174/292 - loss 2.65406568 - time (sec): 2.97 - samples/sec: 8321.21 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:33,409 epoch 1 - iter 203/292 - loss 2.39656343 - time (sec): 3.50 - samples/sec: 8608.21 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:33,935 epoch 1 - iter 232/292 - loss 2.21048062 - time (sec): 4.02 - samples/sec: 8588.98 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:34,473 epoch 1 - iter 261/292 - loss 2.03861344 - time (sec): 4.56 - samples/sec: 8670.12 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:34,988 epoch 1 - iter 290/292 - loss 1.90732826 - time (sec): 5.07 - samples/sec: 8730.43 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:57:35,016 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:35,016 EPOCH 1 done: loss 1.9026 - lr: 0.000030
2023-10-19 23:57:35,275 DEV : loss 0.4726361334323883 - f1-score (micro avg) 0.0
2023-10-19 23:57:35,279 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:35,786 epoch 2 - iter 29/292 - loss 0.88527386 - time (sec): 0.51 - samples/sec: 9692.47 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:57:36,304 epoch 2 - iter 58/292 - loss 0.82889086 - time (sec): 1.02 - samples/sec: 9394.52 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:36,803 epoch 2 - iter 87/292 - loss 0.78547384 - time (sec): 1.52 - samples/sec: 9030.81 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:37,269 epoch 2 - iter 116/292 - loss 0.78010344 - time (sec): 1.99 - samples/sec: 8830.24 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:37,757 epoch 2 - iter 145/292 - loss 0.75316303 - time (sec): 2.48 - samples/sec: 8742.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:38,256 epoch 2 - iter 174/292 - loss 0.73185571 - time (sec): 2.98 - samples/sec: 8727.51 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:38,755 epoch 2 - iter 203/292 - loss 0.71216728 - time (sec): 3.48 - samples/sec: 8641.50 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:39,287 epoch 2 - iter 232/292 - loss 0.67823755 - time (sec): 4.01 - samples/sec: 8867.99 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:39,818 epoch 2 - iter 261/292 - loss 0.66528632 - time (sec): 4.54 - samples/sec: 8918.28 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:40,304 epoch 2 - iter 290/292 - loss 0.66538611 - time (sec): 5.02 - samples/sec: 8778.91 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:40,338 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:40,338 EPOCH 2 done: loss 0.6640 - lr: 0.000027
2023-10-19 23:57:40,964 DEV : loss 0.4068935811519623 - f1-score (micro avg) 0.0
2023-10-19 23:57:40,968 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:41,468 epoch 3 - iter 29/292 - loss 0.50362782 - time (sec): 0.50 - samples/sec: 8760.21 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:41,968 epoch 3 - iter 58/292 - loss 0.52872549 - time (sec): 1.00 - samples/sec: 8770.81 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:42,484 epoch 3 - iter 87/292 - loss 0.55048926 - time (sec): 1.51 - samples/sec: 8999.06 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:43,018 epoch 3 - iter 116/292 - loss 0.59262551 - time (sec): 2.05 - samples/sec: 8752.53 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:43,690 epoch 3 - iter 145/292 - loss 0.58792425 - time (sec): 2.72 - samples/sec: 8189.27 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:44,225 epoch 3 - iter 174/292 - loss 0.57780683 - time (sec): 3.26 - samples/sec: 8332.04 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:44,708 epoch 3 - iter 203/292 - loss 0.57106053 - time (sec): 3.74 - samples/sec: 8318.01 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:45,293 epoch 3 - iter 232/292 - loss 0.56074661 - time (sec): 4.32 - samples/sec: 8263.42 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:45,792 epoch 3 - iter 261/292 - loss 0.55453111 - time (sec): 4.82 - samples/sec: 8194.65 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:46,308 epoch 3 - iter 290/292 - loss 0.55053773 - time (sec): 5.34 - samples/sec: 8260.74 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:46,345 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:46,345 EPOCH 3 done: loss 0.5490 - lr: 0.000023
2023-10-19 23:57:46,967 DEV : loss 0.3729143738746643 - f1-score (micro avg) 0.0
2023-10-19 23:57:46,971 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:47,455 epoch 4 - iter 29/292 - loss 0.44360140 - time (sec): 0.48 - samples/sec: 8111.28 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:47,971 epoch 4 - iter 58/292 - loss 0.46127904 - time (sec): 1.00 - samples/sec: 8078.16 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:48,506 epoch 4 - iter 87/292 - loss 0.45386077 - time (sec): 1.53 - samples/sec: 8329.82 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:49,025 epoch 4 - iter 116/292 - loss 0.45066227 - time (sec): 2.05 - samples/sec: 8285.59 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:49,555 epoch 4 - iter 145/292 - loss 0.45473106 - time (sec): 2.58 - samples/sec: 8211.85 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:50,065 epoch 4 - iter 174/292 - loss 0.45803790 - time (sec): 3.09 - samples/sec: 8220.40 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:50,606 epoch 4 - iter 203/292 - loss 0.47679097 - time (sec): 3.63 - samples/sec: 8472.15 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:51,100 epoch 4 - iter 232/292 - loss 0.47649244 - time (sec): 4.13 - samples/sec: 8360.31 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:51,602 epoch 4 - iter 261/292 - loss 0.47283559 - time (sec): 4.63 - samples/sec: 8370.13 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:52,144 epoch 4 - iter 290/292 - loss 0.47776120 - time (sec): 5.17 - samples/sec: 8569.51 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:52,176 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:52,176 EPOCH 4 done: loss 0.4775 - lr: 0.000020
2023-10-19 23:57:52,799 DEV : loss 0.33535653352737427 - f1-score (micro avg) 0.0368
2023-10-19 23:57:52,803 saving best model
2023-10-19 23:57:52,832 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:53,377 epoch 5 - iter 29/292 - loss 0.45846940 - time (sec): 0.54 - samples/sec: 9610.96 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:53,907 epoch 5 - iter 58/292 - loss 0.51705999 - time (sec): 1.07 - samples/sec: 9124.70 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:54,398 epoch 5 - iter 87/292 - loss 0.48221681 - time (sec): 1.57 - samples/sec: 8759.51 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:54,929 epoch 5 - iter 116/292 - loss 0.46910511 - time (sec): 2.10 - samples/sec: 8518.22 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:55,498 epoch 5 - iter 145/292 - loss 0.46148846 - time (sec): 2.67 - samples/sec: 8533.84 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:56,036 epoch 5 - iter 174/292 - loss 0.44684775 - time (sec): 3.20 - samples/sec: 8422.22 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:56,551 epoch 5 - iter 203/292 - loss 0.44853019 - time (sec): 3.72 - samples/sec: 8311.04 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:57,125 epoch 5 - iter 232/292 - loss 0.46105102 - time (sec): 4.29 - samples/sec: 8270.59 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:57,639 epoch 5 - iter 261/292 - loss 0.45781694 - time (sec): 4.81 - samples/sec: 8194.08 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:58,166 epoch 5 - iter 290/292 - loss 0.44694408 - time (sec): 5.33 - samples/sec: 8314.90 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:58,196 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:58,196 EPOCH 5 done: loss 0.4463 - lr: 0.000017
2023-10-19 23:57:58,831 DEV : loss 0.33983153104782104 - f1-score (micro avg) 0.0687
2023-10-19 23:57:58,835 saving best model
2023-10-19 23:57:58,868 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:59,383 epoch 6 - iter 29/292 - loss 0.44689006 - time (sec): 0.51 - samples/sec: 9067.63 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:57:59,890 epoch 6 - iter 58/292 - loss 0.42022345 - time (sec): 1.02 - samples/sec: 8248.87 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:58:00,381 epoch 6 - iter 87/292 - loss 0.43547237 - time (sec): 1.51 - samples/sec: 8374.78 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:58:00,902 epoch 6 - iter 116/292 - loss 0.42767037 - time (sec): 2.03 - samples/sec: 8496.53 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:01,413 epoch 6 - iter 145/292 - loss 0.41360136 - time (sec): 2.54 - samples/sec: 8648.10 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:01,950 epoch 6 - iter 174/292 - loss 0.41434310 - time (sec): 3.08 - samples/sec: 8717.91 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:02,459 epoch 6 - iter 203/292 - loss 0.40041453 - time (sec): 3.59 - samples/sec: 8764.22 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:02,969 epoch 6 - iter 232/292 - loss 0.40130692 - time (sec): 4.10 - samples/sec: 8773.71 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:03,485 epoch 6 - iter 261/292 - loss 0.40438520 - time (sec): 4.62 - samples/sec: 8665.74 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:04,000 epoch 6 - iter 290/292 - loss 0.41311196 - time (sec): 5.13 - samples/sec: 8614.21 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:04,028 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:04,028 EPOCH 6 done: loss 0.4149 - lr: 0.000013
2023-10-19 23:58:04,664 DEV : loss 0.327802449464798 - f1-score (micro avg) 0.1477
2023-10-19 23:58:04,668 saving best model
2023-10-19 23:58:04,701 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:05,249 epoch 7 - iter 29/292 - loss 0.32104118 - time (sec): 0.55 - samples/sec: 10152.81 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:05,738 epoch 7 - iter 58/292 - loss 0.39113470 - time (sec): 1.04 - samples/sec: 9059.74 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:06,250 epoch 7 - iter 87/292 - loss 0.40749106 - time (sec): 1.55 - samples/sec: 8684.18 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:06,752 epoch 7 - iter 116/292 - loss 0.38645042 - time (sec): 2.05 - samples/sec: 8722.49 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:07,237 epoch 7 - iter 145/292 - loss 0.39391474 - time (sec): 2.53 - samples/sec: 8539.01 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:07,756 epoch 7 - iter 174/292 - loss 0.40980659 - time (sec): 3.05 - samples/sec: 8761.86 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:08,271 epoch 7 - iter 203/292 - loss 0.40209976 - time (sec): 3.57 - samples/sec: 8856.92 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:08,783 epoch 7 - iter 232/292 - loss 0.40744235 - time (sec): 4.08 - samples/sec: 8804.31 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:09,293 epoch 7 - iter 261/292 - loss 0.39712786 - time (sec): 4.59 - samples/sec: 8721.50 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:09,787 epoch 7 - iter 290/292 - loss 0.39302474 - time (sec): 5.09 - samples/sec: 8678.71 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:09,819 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:09,819 EPOCH 7 done: loss 0.3938 - lr: 0.000010
2023-10-19 23:58:10,451 DEV : loss 0.31045615673065186 - f1-score (micro avg) 0.1753
2023-10-19 23:58:10,455 saving best model
2023-10-19 23:58:10,487 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:10,978 epoch 8 - iter 29/292 - loss 0.36499034 - time (sec): 0.49 - samples/sec: 8858.41 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:11,501 epoch 8 - iter 58/292 - loss 0.39731528 - time (sec): 1.01 - samples/sec: 9008.12 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:12,045 epoch 8 - iter 87/292 - loss 0.35849483 - time (sec): 1.56 - samples/sec: 9379.00 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:12,536 epoch 8 - iter 116/292 - loss 0.36888550 - time (sec): 2.05 - samples/sec: 8977.23 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:13,008 epoch 8 - iter 145/292 - loss 0.37562375 - time (sec): 2.52 - samples/sec: 8673.85 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:13,513 epoch 8 - iter 174/292 - loss 0.38262840 - time (sec): 3.03 - samples/sec: 8619.98 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:14,029 epoch 8 - iter 203/292 - loss 0.37658732 - time (sec): 3.54 - samples/sec: 8551.23 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:14,533 epoch 8 - iter 232/292 - loss 0.38303644 - time (sec): 4.05 - samples/sec: 8498.09 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,048 epoch 8 - iter 261/292 - loss 0.37867777 - time (sec): 4.56 - samples/sec: 8501.41 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,578 epoch 8 - iter 290/292 - loss 0.39456462 - time (sec): 5.09 - samples/sec: 8667.21 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,611 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:15,611 EPOCH 8 done: loss 0.3925 - lr: 0.000007
2023-10-19 23:58:16,250 DEV : loss 0.31813567876815796 - f1-score (micro avg) 0.1717
2023-10-19 23:58:16,254 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:16,758 epoch 9 - iter 29/292 - loss 0.40428918 - time (sec): 0.50 - samples/sec: 7980.78 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:17,287 epoch 9 - iter 58/292 - loss 0.36836772 - time (sec): 1.03 - samples/sec: 7894.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:17,850 epoch 9 - iter 87/292 - loss 0.37138841 - time (sec): 1.60 - samples/sec: 8060.88 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:18,362 epoch 9 - iter 116/292 - loss 0.35584352 - time (sec): 2.11 - samples/sec: 8099.67 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:18,884 epoch 9 - iter 145/292 - loss 0.36577051 - time (sec): 2.63 - samples/sec: 7966.11 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:19,552 epoch 9 - iter 174/292 - loss 0.36762424 - time (sec): 3.30 - samples/sec: 7731.22 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:20,069 epoch 9 - iter 203/292 - loss 0.36914510 - time (sec): 3.82 - samples/sec: 7877.82 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:20,623 epoch 9 - iter 232/292 - loss 0.38112081 - time (sec): 4.37 - samples/sec: 8101.40 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:21,126 epoch 9 - iter 261/292 - loss 0.37727984 - time (sec): 4.87 - samples/sec: 8146.24 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:21,648 epoch 9 - iter 290/292 - loss 0.38320637 - time (sec): 5.39 - samples/sec: 8211.37 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:21,676 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:21,676 EPOCH 9 done: loss 0.3842 - lr: 0.000003
2023-10-19 23:58:22,306 DEV : loss 0.31648480892181396 - f1-score (micro avg) 0.1803
2023-10-19 23:58:22,310 saving best model
2023-10-19 23:58:22,343 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:22,869 epoch 10 - iter 29/292 - loss 0.31764849 - time (sec): 0.53 - samples/sec: 9812.80 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:23,401 epoch 10 - iter 58/292 - loss 0.35718997 - time (sec): 1.06 - samples/sec: 10149.09 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:23,875 epoch 10 - iter 87/292 - loss 0.37047487 - time (sec): 1.53 - samples/sec: 9426.05 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:24,360 epoch 10 - iter 116/292 - loss 0.36595183 - time (sec): 2.02 - samples/sec: 9236.35 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:24,860 epoch 10 - iter 145/292 - loss 0.36830394 - time (sec): 2.52 - samples/sec: 8935.26 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:25,373 epoch 10 - iter 174/292 - loss 0.37044997 - time (sec): 3.03 - samples/sec: 8789.73 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:25,857 epoch 10 - iter 203/292 - loss 0.36942048 - time (sec): 3.51 - samples/sec: 8685.67 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:26,366 epoch 10 - iter 232/292 - loss 0.37678363 - time (sec): 4.02 - samples/sec: 8658.91 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:26,886 epoch 10 - iter 261/292 - loss 0.37899069 - time (sec): 4.54 - samples/sec: 8531.89 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:58:27,446 epoch 10 - iter 290/292 - loss 0.37973279 - time (sec): 5.10 - samples/sec: 8676.83 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:58:27,478 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:27,478 EPOCH 10 done: loss 0.3791 - lr: 0.000000
2023-10-19 23:58:28,125 DEV : loss 0.31487029790878296 - f1-score (micro avg) 0.1848
2023-10-19 23:58:28,129 saving best model
2023-10-19 23:58:28,189 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:28,190 Loading model from best epoch ...
2023-10-19 23:58:28,270 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
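The 17-tag dictionary above is the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd): one O tag plus S-/B-/E-/I- variants per type. As a rough sketch of how such a tag sequence decodes into entity spans (an illustrative helper, not Flair's internal decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, ent = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((ent, i, i + 1))
            start, label = None, None
        elif prefix == "B":                    # begin a multi-token entity
            start, label = i, ent
        elif prefix == "E" and label == ent:   # end the open entity
            spans.append((ent, start, i + 1))
            start, label = None, None
        # I- tags continue an open span; malformed sequences are silently skipped
    return spans
```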
2023-10-19 23:58:29,173
Results:
- F-score (micro) 0.2928
- F-score (macro) 0.1505
- Accuracy 0.1777

By class:
              precision    recall  f1-score   support

         PER     0.3594    0.3563    0.3579       348
         LOC     0.2658    0.2261    0.2443       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3228    0.2679    0.2928       683
   macro avg     0.1563    0.1456    0.1505       683
weighted avg     0.2847    0.2679    0.2757       683
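The summary rows can be cross-checked by hand: the micro-average F1 is the harmonic mean of the micro precision and recall, while the macro-average F1 is the unweighted mean of the per-class f1-score column. A quick sanity check (not part of the original log):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.3228, recall 0.2679 -> F-score (micro) 0.2928
micro_f1 = f1(0.3228, 0.2679)

# macro avg: unweighted mean of the four per-class f1-scores -> F-score (macro) 0.1505
macro_f1 = (0.3579 + 0.2443 + 0.0 + 0.0) / 4
```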

2023-10-19 23:58:29,173 ----------------------------------------------------------------------------------------------------