|
2023-10-19 23:54:46,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,265 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
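The printout above is Flair's SequenceTagger wrapping a 2-layer, 128-dimensional historic BERT: TransformerWordEmbeddings feed a LockedDropout and a single Linear layer over 17 tags, trained with plain cross-entropy (no CRF). As a hedged sketch, the embeddings part could be built as follows; the checkpoint name and the "last layer only / first-subtoken pooling" settings are inferred from the training base path further down, not stated in the printout itself.

```python
from flair.embeddings import TransformerWordEmbeddings

# Assumed checkpoint (from the base path below): dbmdz/bert-tiny-historic-multilingual-cased.
# "layers-1" and "poolingfirst" in that path suggest using only the last transformer
# layer and taking the first subtoken of each word.
embeddings = TransformerWordEmbeddings(
    "dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,  # encoder weights are updated during training
)
```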
|
2023-10-19 23:54:46,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,265 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
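A matching sketch for the data and tagger setup. The NER_HIPE_2022 argument names are an assumption based on recent Flair releases (the log only shows the resulting dataset path), and `embeddings` is the object from the sketch above.

```python
from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

# The dataset path above points at HIPE-2022 v2.1, newseye, Finnish, with document
# separators; argument names are assumed, not copied from the original training script.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")

# Span labels (LOC, PER, ORG, HumanProd); the tagger expands them into the
# 17 BIOES tags listed at the end of this log.
label_dict = corpus.make_label_dictionary(label_type="ner")

# No RNN, no CRF, no reprojection layer, which matches the printout above:
# LockedDropout -> Linear(128 -> 17) -> CrossEntropyLoss.
tagger = SequenceTagger(
    hidden_size=128,  # unused without an RNN, but required by the signature
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)
```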
|
2023-10-19 23:54:46,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,265 Train: 1166 sentences
2023-10-19 23:54:46,265 (train_with_dev=False, train_with_test=False)
2023-10-19 23:54:46,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,265 Training Params:
2023-10-19 23:54:46,265 - learning_rate: "5e-05"
2023-10-19 23:54:46,265 - mini_batch_size: "4"
2023-10-19 23:54:46,265 - max_epochs: "10"
2023-10-19 23:54:46,265 - shuffle: "True"
2023-10-19 23:54:46,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,265 Plugins:
2023-10-19 23:54:46,265 - TensorboardLogger
2023-10-19 23:54:46,265 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:54:46,265 ----------------------------------------------------------------------------------------------------
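The LinearScheduler with warmup_fraction '0.1' explains the lr column in the epoch logs below: the learning rate climbs linearly to 5e-05 over roughly the first 10% of the 2,920 total steps (292 iterations x 10 epochs, i.e. about epoch 1) and then decays linearly to zero. A minimal sketch of that schedule, with step counts inferred from the log rather than taken from Flair's implementation:

```python
def linear_warmup_decay(step: int,
                        total_steps: int = 292 * 10,
                        peak_lr: float = 5e-05,
                        warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# e.g. step 29 -> ~5e-06 and step 1750 (end of epoch 6) -> ~2.2e-05,
# matching the lr values logged below
```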
|
2023-10-19 23:54:46,266 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:54:46,266 - metric: "('micro avg', 'f1-score')"
2023-10-19 23:54:46,266 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,266 Computation:
2023-10-19 23:54:46,266 - compute on device: cuda:0
2023-10-19 23:54:46,266 - embedding storage: none
2023-10-19 23:54:46,266 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,266 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 23:54:46,266 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,266 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:46,266 Logging anything other than scalars to TensorBoard is currently not supported.
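Given the parameters above, a hedged reconstruction of the training call, continuing the sketches earlier in this log. ModelTrainer.fine_tune uses AdamW with a linear warmup/decay schedule, which is consistent with the LinearScheduler plugin and the all-zero momentum column below; the exact script and the TensorBoard plugin wiring are not shown in the log, so treat this as a sketch rather than the original setup.

```python
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)  # tagger and corpus from the sketches above

trainer.fine_tune(
    "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    # warmup_fraction of 0.1, as shown in the LinearScheduler line above, is
    # fine_tune's usual default; the logged run also attached a TensorBoard logger.
)
```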
|
2023-10-19 23:54:46,793 epoch 1 - iter 29/292 - loss 3.55752914 - time (sec): 0.53 - samples/sec: 7908.79 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:54:47,324 epoch 1 - iter 58/292 - loss 3.51797914 - time (sec): 1.06 - samples/sec: 8808.85 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:54:47,833 epoch 1 - iter 87/292 - loss 3.38770600 - time (sec): 1.57 - samples/sec: 8612.55 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:54:48,346 epoch 1 - iter 116/292 - loss 3.15720938 - time (sec): 2.08 - samples/sec: 8491.67 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:54:48,870 epoch 1 - iter 145/292 - loss 2.91319087 - time (sec): 2.60 - samples/sec: 8456.38 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:54:49,365 epoch 1 - iter 174/292 - loss 2.68788876 - time (sec): 3.10 - samples/sec: 8374.19 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:54:49,915 epoch 1 - iter 203/292 - loss 2.45249775 - time (sec): 3.65 - samples/sec: 8479.75 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:54:50,473 epoch 1 - iter 232/292 - loss 2.21945254 - time (sec): 4.21 - samples/sec: 8501.77 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:54:50,953 epoch 1 - iter 261/292 - loss 2.05466586 - time (sec): 4.69 - samples/sec: 8518.18 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:54:51,428 epoch 1 - iter 290/292 - loss 1.91851237 - time (sec): 5.16 - samples/sec: 8529.52 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:54:51,462 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:51,462 EPOCH 1 done: loss 1.9040 - lr: 0.000049
2023-10-19 23:54:51,728 DEV : loss 0.45997393131256104 - f1-score (micro avg) 0.0
2023-10-19 23:54:51,732 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:52,162 epoch 2 - iter 29/292 - loss 0.58656023 - time (sec): 0.43 - samples/sec: 7909.71 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:54:52,603 epoch 2 - iter 58/292 - loss 0.63439941 - time (sec): 0.87 - samples/sec: 9252.63 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:54:53,049 epoch 2 - iter 87/292 - loss 0.63149781 - time (sec): 1.32 - samples/sec: 9293.51 - lr: 0.000048 - momentum: 0.000000
2023-10-19 23:54:53,493 epoch 2 - iter 116/292 - loss 0.62294731 - time (sec): 1.76 - samples/sec: 9411.80 - lr: 0.000048 - momentum: 0.000000
2023-10-19 23:54:53,982 epoch 2 - iter 145/292 - loss 0.67190983 - time (sec): 2.25 - samples/sec: 9782.63 - lr: 0.000047 - momentum: 0.000000
2023-10-19 23:54:54,587 epoch 2 - iter 174/292 - loss 0.65041157 - time (sec): 2.85 - samples/sec: 9342.32 - lr: 0.000047 - momentum: 0.000000
2023-10-19 23:54:55,159 epoch 2 - iter 203/292 - loss 0.62124438 - time (sec): 3.43 - samples/sec: 9211.13 - lr: 0.000046 - momentum: 0.000000
2023-10-19 23:54:55,714 epoch 2 - iter 232/292 - loss 0.61325278 - time (sec): 3.98 - samples/sec: 8902.60 - lr: 0.000046 - momentum: 0.000000
2023-10-19 23:54:56,273 epoch 2 - iter 261/292 - loss 0.60750010 - time (sec): 4.54 - samples/sec: 8763.90 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:54:56,780 epoch 2 - iter 290/292 - loss 0.59371287 - time (sec): 5.05 - samples/sec: 8713.85 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:54:56,811 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:56,811 EPOCH 2 done: loss 0.5942 - lr: 0.000045
2023-10-19 23:54:57,612 DEV : loss 0.34521862864494324 - f1-score (micro avg) 0.0
2023-10-19 23:54:57,616 ----------------------------------------------------------------------------------------------------
2023-10-19 23:54:58,150 epoch 3 - iter 29/292 - loss 0.43874260 - time (sec): 0.53 - samples/sec: 8885.33 - lr: 0.000044 - momentum: 0.000000
2023-10-19 23:54:58,660 epoch 3 - iter 58/292 - loss 0.45959734 - time (sec): 1.04 - samples/sec: 8864.19 - lr: 0.000043 - momentum: 0.000000
2023-10-19 23:54:59,178 epoch 3 - iter 87/292 - loss 0.46972901 - time (sec): 1.56 - samples/sec: 8336.04 - lr: 0.000043 - momentum: 0.000000
2023-10-19 23:54:59,714 epoch 3 - iter 116/292 - loss 0.45936071 - time (sec): 2.10 - samples/sec: 8357.63 - lr: 0.000042 - momentum: 0.000000
2023-10-19 23:55:00,257 epoch 3 - iter 145/292 - loss 0.48567862 - time (sec): 2.64 - samples/sec: 8187.12 - lr: 0.000042 - momentum: 0.000000
2023-10-19 23:55:00,798 epoch 3 - iter 174/292 - loss 0.47727066 - time (sec): 3.18 - samples/sec: 8207.62 - lr: 0.000041 - momentum: 0.000000
2023-10-19 23:55:01,351 epoch 3 - iter 203/292 - loss 0.48496424 - time (sec): 3.73 - samples/sec: 8427.15 - lr: 0.000041 - momentum: 0.000000
2023-10-19 23:55:01,873 epoch 3 - iter 232/292 - loss 0.48084606 - time (sec): 4.26 - samples/sec: 8398.02 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:55:02,397 epoch 3 - iter 261/292 - loss 0.47958169 - time (sec): 4.78 - samples/sec: 8305.75 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:55:02,917 epoch 3 - iter 290/292 - loss 0.47094732 - time (sec): 5.30 - samples/sec: 8310.97 - lr: 0.000039 - momentum: 0.000000
2023-10-19 23:55:02,952 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:02,953 EPOCH 3 done: loss 0.4685 - lr: 0.000039
2023-10-19 23:55:03,596 DEV : loss 0.31381186842918396 - f1-score (micro avg) 0.1303
2023-10-19 23:55:03,600 saving best model
2023-10-19 23:55:03,629 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:04,156 epoch 4 - iter 29/292 - loss 0.43177072 - time (sec): 0.53 - samples/sec: 8615.07 - lr: 0.000038 - momentum: 0.000000
2023-10-19 23:55:04,683 epoch 4 - iter 58/292 - loss 0.41853015 - time (sec): 1.05 - samples/sec: 8960.14 - lr: 0.000038 - momentum: 0.000000
2023-10-19 23:55:05,216 epoch 4 - iter 87/292 - loss 0.39904768 - time (sec): 1.59 - samples/sec: 8975.40 - lr: 0.000037 - momentum: 0.000000
2023-10-19 23:55:05,693 epoch 4 - iter 116/292 - loss 0.39486048 - time (sec): 2.06 - samples/sec: 8683.45 - lr: 0.000037 - momentum: 0.000000
2023-10-19 23:55:06,182 epoch 4 - iter 145/292 - loss 0.38926525 - time (sec): 2.55 - samples/sec: 8552.06 - lr: 0.000036 - momentum: 0.000000
2023-10-19 23:55:06,693 epoch 4 - iter 174/292 - loss 0.38841178 - time (sec): 3.06 - samples/sec: 8439.81 - lr: 0.000036 - momentum: 0.000000
2023-10-19 23:55:07,187 epoch 4 - iter 203/292 - loss 0.38717609 - time (sec): 3.56 - samples/sec: 8323.23 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:55:07,715 epoch 4 - iter 232/292 - loss 0.39343263 - time (sec): 4.09 - samples/sec: 8440.52 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:55:08,255 epoch 4 - iter 261/292 - loss 0.41230712 - time (sec): 4.63 - samples/sec: 8572.69 - lr: 0.000034 - momentum: 0.000000
2023-10-19 23:55:08,816 epoch 4 - iter 290/292 - loss 0.41910163 - time (sec): 5.19 - samples/sec: 8487.60 - lr: 0.000033 - momentum: 0.000000
2023-10-19 23:55:08,856 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:08,856 EPOCH 4 done: loss 0.4154 - lr: 0.000033
2023-10-19 23:55:09,491 DEV : loss 0.3016578257083893 - f1-score (micro avg) 0.2328
2023-10-19 23:55:09,495 saving best model
2023-10-19 23:55:09,530 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:10,074 epoch 5 - iter 29/292 - loss 0.43229368 - time (sec): 0.54 - samples/sec: 8265.43 - lr: 0.000033 - momentum: 0.000000
2023-10-19 23:55:10,601 epoch 5 - iter 58/292 - loss 0.37539253 - time (sec): 1.07 - samples/sec: 8654.01 - lr: 0.000032 - momentum: 0.000000
2023-10-19 23:55:11,118 epoch 5 - iter 87/292 - loss 0.40406274 - time (sec): 1.59 - samples/sec: 8605.36 - lr: 0.000032 - momentum: 0.000000
2023-10-19 23:55:11,636 epoch 5 - iter 116/292 - loss 0.40222319 - time (sec): 2.11 - samples/sec: 8412.63 - lr: 0.000031 - momentum: 0.000000
2023-10-19 23:55:12,149 epoch 5 - iter 145/292 - loss 0.39690686 - time (sec): 2.62 - samples/sec: 8624.99 - lr: 0.000031 - momentum: 0.000000
2023-10-19 23:55:12,669 epoch 5 - iter 174/292 - loss 0.39595604 - time (sec): 3.14 - samples/sec: 8461.59 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:55:13,173 epoch 5 - iter 203/292 - loss 0.39014674 - time (sec): 3.64 - samples/sec: 8592.12 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:55:13,667 epoch 5 - iter 232/292 - loss 0.39145583 - time (sec): 4.14 - samples/sec: 8507.40 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:55:14,174 epoch 5 - iter 261/292 - loss 0.38172039 - time (sec): 4.64 - samples/sec: 8575.01 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:55:14,665 epoch 5 - iter 290/292 - loss 0.37542897 - time (sec): 5.13 - samples/sec: 8597.45 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:55:14,700 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:14,700 EPOCH 5 done: loss 0.3775 - lr: 0.000028
2023-10-19 23:55:15,336 DEV : loss 0.2973732054233551 - f1-score (micro avg) 0.2654
2023-10-19 23:55:15,340 saving best model
2023-10-19 23:55:15,372 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:15,873 epoch 6 - iter 29/292 - loss 0.37312102 - time (sec): 0.50 - samples/sec: 9364.57 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:55:16,391 epoch 6 - iter 58/292 - loss 0.36942596 - time (sec): 1.02 - samples/sec: 8762.88 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:55:16,914 epoch 6 - iter 87/292 - loss 0.34805366 - time (sec): 1.54 - samples/sec: 8399.49 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:55:17,447 epoch 6 - iter 116/292 - loss 0.37301859 - time (sec): 2.07 - samples/sec: 8795.53 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:55:17,968 epoch 6 - iter 145/292 - loss 0.38869681 - time (sec): 2.60 - samples/sec: 8817.51 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:55:18,485 epoch 6 - iter 174/292 - loss 0.36418246 - time (sec): 3.11 - samples/sec: 8987.93 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:55:19,005 epoch 6 - iter 203/292 - loss 0.36949889 - time (sec): 3.63 - samples/sec: 8807.31 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:55:19,515 epoch 6 - iter 232/292 - loss 0.36160845 - time (sec): 4.14 - samples/sec: 8753.61 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:55:20,009 epoch 6 - iter 261/292 - loss 0.36352966 - time (sec): 4.64 - samples/sec: 8640.60 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:55:20,507 epoch 6 - iter 290/292 - loss 0.35857782 - time (sec): 5.13 - samples/sec: 8588.66 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:55:20,538 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:20,538 EPOCH 6 done: loss 0.3575 - lr: 0.000022
2023-10-19 23:55:21,183 DEV : loss 0.29466673731803894 - f1-score (micro avg) 0.2937
2023-10-19 23:55:21,187 saving best model
2023-10-19 23:55:21,221 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:21,729 epoch 7 - iter 29/292 - loss 0.38819137 - time (sec): 0.51 - samples/sec: 8190.47 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:55:22,263 epoch 7 - iter 58/292 - loss 0.34242214 - time (sec): 1.04 - samples/sec: 8732.14 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:55:22,798 epoch 7 - iter 87/292 - loss 0.35717419 - time (sec): 1.58 - samples/sec: 8812.53 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:55:23,307 epoch 7 - iter 116/292 - loss 0.37602817 - time (sec): 2.09 - samples/sec: 8721.64 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:55:23,820 epoch 7 - iter 145/292 - loss 0.36879662 - time (sec): 2.60 - samples/sec: 8665.03 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:55:24,342 epoch 7 - iter 174/292 - loss 0.36046721 - time (sec): 3.12 - samples/sec: 8544.70 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:55:24,835 epoch 7 - iter 203/292 - loss 0.35584894 - time (sec): 3.61 - samples/sec: 8468.06 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:55:25,332 epoch 7 - iter 232/292 - loss 0.35765109 - time (sec): 4.11 - samples/sec: 8520.60 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:55:25,837 epoch 7 - iter 261/292 - loss 0.34589803 - time (sec): 4.62 - samples/sec: 8620.32 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:55:26,370 epoch 7 - iter 290/292 - loss 0.33965089 - time (sec): 5.15 - samples/sec: 8590.24 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:55:26,397 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:26,397 EPOCH 7 done: loss 0.3397 - lr: 0.000017
2023-10-19 23:55:27,049 DEV : loss 0.29104095697402954 - f1-score (micro avg) 0.3193
2023-10-19 23:55:27,052 saving best model
2023-10-19 23:55:27,085 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:27,615 epoch 8 - iter 29/292 - loss 0.29154439 - time (sec): 0.53 - samples/sec: 8832.63 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:55:28,142 epoch 8 - iter 58/292 - loss 0.34250586 - time (sec): 1.06 - samples/sec: 9218.94 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:55:28,629 epoch 8 - iter 87/292 - loss 0.32930383 - time (sec): 1.54 - samples/sec: 8830.03 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:55:29,160 epoch 8 - iter 116/292 - loss 0.32840663 - time (sec): 2.07 - samples/sec: 8764.31 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:55:29,636 epoch 8 - iter 145/292 - loss 0.32284264 - time (sec): 2.55 - samples/sec: 8542.07 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:55:30,078 epoch 8 - iter 174/292 - loss 0.32209124 - time (sec): 2.99 - samples/sec: 8424.20 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:55:30,578 epoch 8 - iter 203/292 - loss 0.33155736 - time (sec): 3.49 - samples/sec: 8675.07 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:55:31,241 epoch 8 - iter 232/292 - loss 0.32136242 - time (sec): 4.16 - samples/sec: 8525.82 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:55:31,714 epoch 8 - iter 261/292 - loss 0.32137603 - time (sec): 4.63 - samples/sec: 8544.97 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:55:32,222 epoch 8 - iter 290/292 - loss 0.32337631 - time (sec): 5.14 - samples/sec: 8605.69 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:55:32,257 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:32,257 EPOCH 8 done: loss 0.3226 - lr: 0.000011
2023-10-19 23:55:32,918 DEV : loss 0.29328519105911255 - f1-score (micro avg) 0.3129
2023-10-19 23:55:32,923 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:33,419 epoch 9 - iter 29/292 - loss 0.29822543 - time (sec): 0.50 - samples/sec: 8134.79 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:55:33,914 epoch 9 - iter 58/292 - loss 0.34739399 - time (sec): 0.99 - samples/sec: 8415.03 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:55:34,382 epoch 9 - iter 87/292 - loss 0.34634734 - time (sec): 1.46 - samples/sec: 8010.70 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:55:34,892 epoch 9 - iter 116/292 - loss 0.34386116 - time (sec): 1.97 - samples/sec: 8071.38 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:55:35,411 epoch 9 - iter 145/292 - loss 0.34301480 - time (sec): 2.49 - samples/sec: 8269.81 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:55:35,926 epoch 9 - iter 174/292 - loss 0.33151984 - time (sec): 3.00 - samples/sec: 8533.95 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:55:36,436 epoch 9 - iter 203/292 - loss 0.33282371 - time (sec): 3.51 - samples/sec: 8455.15 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:55:36,974 epoch 9 - iter 232/292 - loss 0.33502228 - time (sec): 4.05 - samples/sec: 8611.47 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:55:37,492 epoch 9 - iter 261/292 - loss 0.32271061 - time (sec): 4.57 - samples/sec: 8680.15 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:55:38,050 epoch 9 - iter 290/292 - loss 0.31960374 - time (sec): 5.13 - samples/sec: 8640.65 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:55:38,079 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:38,079 EPOCH 9 done: loss 0.3192 - lr: 0.000006
2023-10-19 23:55:38,728 DEV : loss 0.2915344834327698 - f1-score (micro avg) 0.3067
2023-10-19 23:55:38,731 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:39,234 epoch 10 - iter 29/292 - loss 0.25691072 - time (sec): 0.50 - samples/sec: 9550.60 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:55:39,752 epoch 10 - iter 58/292 - loss 0.32264057 - time (sec): 1.02 - samples/sec: 9710.22 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:55:40,262 epoch 10 - iter 87/292 - loss 0.29381973 - time (sec): 1.53 - samples/sec: 9342.81 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:55:40,745 epoch 10 - iter 116/292 - loss 0.29276188 - time (sec): 2.01 - samples/sec: 9055.61 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:55:41,215 epoch 10 - iter 145/292 - loss 0.30739420 - time (sec): 2.48 - samples/sec: 8724.85 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:55:41,719 epoch 10 - iter 174/292 - loss 0.30017946 - time (sec): 2.99 - samples/sec: 8931.07 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:55:42,236 epoch 10 - iter 203/292 - loss 0.29891419 - time (sec): 3.50 - samples/sec: 8834.20 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:55:42,768 epoch 10 - iter 232/292 - loss 0.30792335 - time (sec): 4.04 - samples/sec: 8901.96 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:55:43,258 epoch 10 - iter 261/292 - loss 0.31694630 - time (sec): 4.53 - samples/sec: 8787.52 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:55:43,793 epoch 10 - iter 290/292 - loss 0.31667346 - time (sec): 5.06 - samples/sec: 8740.26 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:55:43,823 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:43,823 EPOCH 10 done: loss 0.3158 - lr: 0.000000
2023-10-19 23:55:44,474 DEV : loss 0.2926194965839386 - f1-score (micro avg) 0.307
2023-10-19 23:55:44,506 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:44,507 Loading model from best epoch ...
2023-10-19 23:55:44,580 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
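The 17 predicted tags follow the BIOES scheme (S-, B-, E-, I- prefixes plus O) over the four entity types LOC, PER, ORG and HumanProd. A minimal usage sketch for the saved checkpoint; the path is simply the base path above plus best-model.pt, and the Finnish example sentence is an illustration only.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written by the training run above.
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Helsingin kaupunki sijaitsee Suomessa .")
tagger.predict(sentence)

# Print the predicted entity spans with their labels and confidences.
for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 2))
```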
|
2023-10-19 23:55:45,486
Results:
- F-score (micro) 0.3714
- F-score (macro) 0.196
- Accuracy 0.237

By class:
              precision    recall  f1-score   support

         PER     0.3965    0.3908    0.3936       348
         LOC     0.3316    0.4751    0.3906       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3626    0.3807    0.3714       683
   macro avg     0.1820    0.2165    0.1960       683
weighted avg     0.3287    0.3807    0.3498       683
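For readers cross-checking the table: the macro averages are plain means of the four per-class rows (the zero ORG and HumanProd rows pull them down), while the micro F1 is the harmonic mean of the micro precision and recall. A quick consistency check using the rounded values above:

```python
per_class_f1 = [0.3936, 0.3906, 0.0000, 0.0000]   # PER, LOC, ORG, HumanProd
macro_f1 = sum(per_class_f1) / len(per_class_f1)  # = 0.19605, consistent with the reported 0.1960

micro_p, micro_r = 0.3626, 0.3807
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # = 0.3714 to four decimals
```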
|
|
|
2023-10-19 23:55:45,486 ----------------------------------------------------------------------------------------------------
|
|