|
2023-10-19 23:58:32,450 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(32001, 128) |
|
        (position_embeddings): Embedding(512, 128) |
|
        (token_type_embeddings): Embedding(2, 128) |
|
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-1): 2 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=128, out_features=128, bias=True) |
|
                (key): Linear(in_features=128, out_features=128, bias=True) |
|
                (value): Linear(in_features=128, out_features=128, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=128, out_features=128, bias=True) |
|
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=128, out_features=512, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=512, out_features=128, bias=True) |
|
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=128, out_features=128, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=128, out_features=17, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-19 23:58:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-19 23:58:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 Train: 1166 sentences |
|
2023-10-19 23:58:32,451 (train_with_dev=False, train_with_test=False) |
|
2023-10-19 23:58:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 Training Params: |
|
2023-10-19 23:58:32,451 - learning_rate: "5e-05" |
|
2023-10-19 23:58:32,451 - mini_batch_size: "4" |
|
2023-10-19 23:58:32,451 - max_epochs: "10" |
|
2023-10-19 23:58:32,451 - shuffle: "True" |
|
2023-10-19 23:58:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 Plugins: |
|
2023-10-19 23:58:32,451 - TensorboardLogger |
|
2023-10-19 23:58:32,451 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-19 23:58:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-19 23:58:32,451 - metric: "('micro avg', 'f1-score')" |
|
2023-10-19 23:58:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,451 Computation: |
|
2023-10-19 23:58:32,451 - compute on device: cuda:0 |
|
2023-10-19 23:58:32,451 - embedding storage: none |
|
2023-10-19 23:58:32,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,452 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-19 23:58:32,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:32,452 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-19 23:58:32,922 epoch 1 - iter 29/292 - loss 3.14779499 - time (sec): 0.47 - samples/sec: 8417.41 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:58:33,371 epoch 1 - iter 58/292 - loss 3.07959948 - time (sec): 0.92 - samples/sec: 8375.04 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:58:33,854 epoch 1 - iter 87/292 - loss 2.99636841 - time (sec): 1.40 - samples/sec: 8658.66 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:58:34,361 epoch 1 - iter 116/292 - loss 2.81136763 - time (sec): 1.91 - samples/sec: 8448.44 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:58:34,874 epoch 1 - iter 145/292 - loss 2.57856572 - time (sec): 2.42 - samples/sec: 8523.19 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:58:35,405 epoch 1 - iter 174/292 - loss 2.34135925 - time (sec): 2.95 - samples/sec: 8378.57 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:58:35,938 epoch 1 - iter 203/292 - loss 2.04834995 - time (sec): 3.49 - samples/sec: 8631.55 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-19 23:58:36,449 epoch 1 - iter 232/292 - loss 1.87569980 - time (sec): 4.00 - samples/sec: 8640.24 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-19 23:58:36,995 epoch 1 - iter 261/292 - loss 1.72976459 - time (sec): 4.54 - samples/sec: 8700.50 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-19 23:58:37,512 epoch 1 - iter 290/292 - loss 1.62662447 - time (sec): 5.06 - samples/sec: 8755.13 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-19 23:58:37,540 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:37,540 EPOCH 1 done: loss 1.6226 - lr: 0.000049 |
|
2023-10-19 23:58:37,807 DEV : loss 0.46307969093322754 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:58:37,811 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:38,333 epoch 2 - iter 29/292 - loss 0.82866034 - time (sec): 0.52 - samples/sec: 9421.74 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-19 23:58:38,857 epoch 2 - iter 58/292 - loss 0.77057818 - time (sec): 1.05 - samples/sec: 9203.30 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-19 23:58:39,364 epoch 2 - iter 87/292 - loss 0.72915775 - time (sec): 1.55 - samples/sec: 8858.99 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-19 23:58:39,867 epoch 2 - iter 116/292 - loss 0.72394303 - time (sec): 2.06 - samples/sec: 8546.65 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-19 23:58:40,382 epoch 2 - iter 145/292 - loss 0.69723734 - time (sec): 2.57 - samples/sec: 8424.58 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-19 23:58:40,892 epoch 2 - iter 174/292 - loss 0.67216825 - time (sec): 3.08 - samples/sec: 8432.93 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-19 23:58:41,418 epoch 2 - iter 203/292 - loss 0.65219976 - time (sec): 3.61 - samples/sec: 8329.25 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-19 23:58:41,947 epoch 2 - iter 232/292 - loss 0.61740936 - time (sec): 4.14 - samples/sec: 8592.86 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-19 23:58:42,474 epoch 2 - iter 261/292 - loss 0.60180613 - time (sec): 4.66 - samples/sec: 8681.15 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-19 23:58:42,967 epoch 2 - iter 290/292 - loss 0.60028304 - time (sec): 5.16 - samples/sec: 8555.21 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-19 23:58:42,998 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:42,999 EPOCH 2 done: loss 0.5990 - lr: 0.000045 |
|
2023-10-19 23:58:43,626 DEV : loss 0.3582861125469208 - f1-score (micro avg) 0.0085 |
|
2023-10-19 23:58:43,630 saving best model |
|
2023-10-19 23:58:43,661 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:44,166 epoch 3 - iter 29/292 - loss 0.42071781 - time (sec): 0.50 - samples/sec: 8686.20 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-19 23:58:44,684 epoch 3 - iter 58/292 - loss 0.43984250 - time (sec): 1.02 - samples/sec: 8569.64 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-19 23:58:45,207 epoch 3 - iter 87/292 - loss 0.46328548 - time (sec): 1.55 - samples/sec: 8822.27 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-19 23:58:45,733 epoch 3 - iter 116/292 - loss 0.49552322 - time (sec): 2.07 - samples/sec: 8657.57 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-19 23:58:46,403 epoch 3 - iter 145/292 - loss 0.49277663 - time (sec): 2.74 - samples/sec: 8130.26 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-19 23:58:46,912 epoch 3 - iter 174/292 - loss 0.48425229 - time (sec): 3.25 - samples/sec: 8346.84 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-19 23:58:47,345 epoch 3 - iter 203/292 - loss 0.47920464 - time (sec): 3.68 - samples/sec: 8445.19 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-19 23:58:47,790 epoch 3 - iter 232/292 - loss 0.47196920 - time (sec): 4.13 - samples/sec: 8656.03 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-19 23:58:48,225 epoch 3 - iter 261/292 - loss 0.46824817 - time (sec): 4.56 - samples/sec: 8662.99 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-19 23:58:48,718 epoch 3 - iter 290/292 - loss 0.46590176 - time (sec): 5.06 - samples/sec: 8723.29 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-19 23:58:48,759 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:48,759 EPOCH 3 done: loss 0.4648 - lr: 0.000039 |
|
2023-10-19 23:58:49,392 DEV : loss 0.3454797565937042 - f1-score (micro avg) 0.0561 |
|
2023-10-19 23:58:49,396 saving best model |
|
2023-10-19 23:58:49,434 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:49,923 epoch 4 - iter 29/292 - loss 0.37633271 - time (sec): 0.49 - samples/sec: 8019.09 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-19 23:58:50,428 epoch 4 - iter 58/292 - loss 0.39081668 - time (sec): 0.99 - samples/sec: 8129.96 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-19 23:58:50,948 epoch 4 - iter 87/292 - loss 0.38739784 - time (sec): 1.51 - samples/sec: 8446.57 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-19 23:58:51,445 epoch 4 - iter 116/292 - loss 0.38345280 - time (sec): 2.01 - samples/sec: 8461.51 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-19 23:58:51,973 epoch 4 - iter 145/292 - loss 0.38672167 - time (sec): 2.54 - samples/sec: 8359.16 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-19 23:58:52,487 epoch 4 - iter 174/292 - loss 0.39123246 - time (sec): 3.05 - samples/sec: 8332.83 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-19 23:58:53,015 epoch 4 - iter 203/292 - loss 0.40684055 - time (sec): 3.58 - samples/sec: 8599.45 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-19 23:58:53,514 epoch 4 - iter 232/292 - loss 0.40715733 - time (sec): 4.08 - samples/sec: 8461.76 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-19 23:58:54,020 epoch 4 - iter 261/292 - loss 0.40395442 - time (sec): 4.58 - samples/sec: 8452.62 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-19 23:58:54,583 epoch 4 - iter 290/292 - loss 0.40765257 - time (sec): 5.15 - samples/sec: 8610.09 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-19 23:58:54,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:54,611 EPOCH 4 done: loss 0.4075 - lr: 0.000033 |
|
2023-10-19 23:58:55,258 DEV : loss 0.31024691462516785 - f1-score (micro avg) 0.2088 |
|
2023-10-19 23:58:55,262 saving best model |
|
2023-10-19 23:58:55,296 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:58:55,802 epoch 5 - iter 29/292 - loss 0.38537138 - time (sec): 0.51 - samples/sec: 10355.27 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-19 23:58:56,323 epoch 5 - iter 58/292 - loss 0.43876506 - time (sec): 1.03 - samples/sec: 9554.92 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-19 23:58:56,823 epoch 5 - iter 87/292 - loss 0.40671441 - time (sec): 1.53 - samples/sec: 8982.40 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-19 23:58:57,330 epoch 5 - iter 116/292 - loss 0.40007945 - time (sec): 2.03 - samples/sec: 8784.42 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-19 23:58:57,858 epoch 5 - iter 145/292 - loss 0.39158917 - time (sec): 2.56 - samples/sec: 8880.38 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-19 23:58:58,371 epoch 5 - iter 174/292 - loss 0.37921891 - time (sec): 3.07 - samples/sec: 8775.96 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:58:58,855 epoch 5 - iter 203/292 - loss 0.38011427 - time (sec): 3.56 - samples/sec: 8685.58 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:58:59,389 epoch 5 - iter 232/292 - loss 0.39053655 - time (sec): 4.09 - samples/sec: 8676.83 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:58:59,890 epoch 5 - iter 261/292 - loss 0.38870844 - time (sec): 4.59 - samples/sec: 8575.24 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:59:00,411 epoch 5 - iter 290/292 - loss 0.37911182 - time (sec): 5.11 - samples/sec: 8672.12 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:59:00,439 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:00,439 EPOCH 5 done: loss 0.3785 - lr: 0.000028 |
|
2023-10-19 23:59:01,076 DEV : loss 0.3215314745903015 - f1-score (micro avg) 0.2246 |
|
2023-10-19 23:59:01,080 saving best model |
|
2023-10-19 23:59:01,113 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:01,632 epoch 6 - iter 29/292 - loss 0.38224130 - time (sec): 0.52 - samples/sec: 9011.08 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:59:02,088 epoch 6 - iter 58/292 - loss 0.36144289 - time (sec): 0.97 - samples/sec: 8652.39 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:59:02,532 epoch 6 - iter 87/292 - loss 0.37097587 - time (sec): 1.42 - samples/sec: 8935.31 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:59:02,976 epoch 6 - iter 116/292 - loss 0.36235103 - time (sec): 1.86 - samples/sec: 9274.47 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:59:03,426 epoch 6 - iter 145/292 - loss 0.35004386 - time (sec): 2.31 - samples/sec: 9520.28 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:59:03,894 epoch 6 - iter 174/292 - loss 0.35006772 - time (sec): 2.78 - samples/sec: 9664.14 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:59:04,370 epoch 6 - iter 203/292 - loss 0.33807395 - time (sec): 3.26 - samples/sec: 9664.13 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:59:04,858 epoch 6 - iter 232/292 - loss 0.33896877 - time (sec): 3.74 - samples/sec: 9609.82 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:59:05,390 epoch 6 - iter 261/292 - loss 0.34191111 - time (sec): 4.28 - samples/sec: 9355.23 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:59:05,897 epoch 6 - iter 290/292 - loss 0.34950655 - time (sec): 4.78 - samples/sec: 9240.88 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:59:05,928 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:05,928 EPOCH 6 done: loss 0.3515 - lr: 0.000022 |
|
2023-10-19 23:59:06,570 DEV : loss 0.31194955110549927 - f1-score (micro avg) 0.2452 |
|
2023-10-19 23:59:06,574 saving best model |
|
2023-10-19 23:59:06,608 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:07,151 epoch 7 - iter 29/292 - loss 0.26753793 - time (sec): 0.54 - samples/sec: 10262.20 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:59:07,637 epoch 7 - iter 58/292 - loss 0.33094214 - time (sec): 1.03 - samples/sec: 9137.23 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:59:08,121 epoch 7 - iter 87/292 - loss 0.34593265 - time (sec): 1.51 - samples/sec: 8884.86 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:59:08,643 epoch 7 - iter 116/292 - loss 0.32828561 - time (sec): 2.03 - samples/sec: 8790.71 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:59:09,166 epoch 7 - iter 145/292 - loss 0.33425252 - time (sec): 2.56 - samples/sec: 8462.47 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:59:09,718 epoch 7 - iter 174/292 - loss 0.34544164 - time (sec): 3.11 - samples/sec: 8606.96 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:59:10,245 epoch 7 - iter 203/292 - loss 0.33684443 - time (sec): 3.64 - samples/sec: 8692.09 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:59:10,742 epoch 7 - iter 232/292 - loss 0.34217880 - time (sec): 4.13 - samples/sec: 8692.34 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:59:11,257 epoch 7 - iter 261/292 - loss 0.33355551 - time (sec): 4.65 - samples/sec: 8615.33 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:59:11,785 epoch 7 - iter 290/292 - loss 0.32942474 - time (sec): 5.18 - samples/sec: 8527.00 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:59:11,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:11,818 EPOCH 7 done: loss 0.3300 - lr: 0.000017 |
|
2023-10-19 23:59:12,459 DEV : loss 0.2952404320240021 - f1-score (micro avg) 0.2941 |
|
2023-10-19 23:59:12,463 saving best model |
|
2023-10-19 23:59:12,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:12,988 epoch 8 - iter 29/292 - loss 0.30565935 - time (sec): 0.49 - samples/sec: 8828.66 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:59:13,524 epoch 8 - iter 58/292 - loss 0.34043801 - time (sec): 1.03 - samples/sec: 8887.12 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:59:14,090 epoch 8 - iter 87/292 - loss 0.30212031 - time (sec): 1.59 - samples/sec: 9162.88 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:59:14,615 epoch 8 - iter 116/292 - loss 0.31110236 - time (sec): 2.12 - samples/sec: 8676.70 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:59:15,118 epoch 8 - iter 145/292 - loss 0.31665402 - time (sec): 2.62 - samples/sec: 8340.54 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:59:15,650 epoch 8 - iter 174/292 - loss 0.32087859 - time (sec): 3.15 - samples/sec: 8269.51 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:59:16,189 epoch 8 - iter 203/292 - loss 0.31580278 - time (sec): 3.69 - samples/sec: 8200.91 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 23:59:16,730 epoch 8 - iter 232/292 - loss 0.32171904 - time (sec): 4.23 - samples/sec: 8120.26 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:59:17,272 epoch 8 - iter 261/292 - loss 0.31735134 - time (sec): 4.78 - samples/sec: 8118.22 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 23:59:17,799 epoch 8 - iter 290/292 - loss 0.33042954 - time (sec): 5.30 - samples/sec: 8319.87 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:59:17,831 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:17,831 EPOCH 8 done: loss 0.3287 - lr: 0.000011 |
|
2023-10-19 23:59:18,470 DEV : loss 0.300225168466568 - f1-score (micro avg) 0.274 |
|
2023-10-19 23:59:18,474 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:18,946 epoch 9 - iter 29/292 - loss 0.33105087 - time (sec): 0.47 - samples/sec: 8533.89 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:59:19,435 epoch 9 - iter 58/292 - loss 0.30511678 - time (sec): 0.96 - samples/sec: 8494.20 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:59:19,961 epoch 9 - iter 87/292 - loss 0.30860155 - time (sec): 1.49 - samples/sec: 8654.05 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 23:59:20,608 epoch 9 - iter 116/292 - loss 0.29350432 - time (sec): 2.13 - samples/sec: 8002.46 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 23:59:21,087 epoch 9 - iter 145/292 - loss 0.30324049 - time (sec): 2.61 - samples/sec: 8017.85 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:59:21,607 epoch 9 - iter 174/292 - loss 0.30397607 - time (sec): 3.13 - samples/sec: 8139.04 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:59:22,126 epoch 9 - iter 203/292 - loss 0.30607357 - time (sec): 3.65 - samples/sec: 8231.07 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:59:22,674 epoch 9 - iter 232/292 - loss 0.31419985 - time (sec): 4.20 - samples/sec: 8426.15 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 23:59:23,183 epoch 9 - iter 261/292 - loss 0.31077284 - time (sec): 4.71 - samples/sec: 8428.15 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:59:23,693 epoch 9 - iter 290/292 - loss 0.31676368 - time (sec): 5.22 - samples/sec: 8487.29 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:59:23,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:23,722 EPOCH 9 done: loss 0.3177 - lr: 0.000006 |
|
2023-10-19 23:59:24,359 DEV : loss 0.30219903588294983 - f1-score (micro avg) 0.2727 |
|
2023-10-19 23:59:24,364 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:24,907 epoch 10 - iter 29/292 - loss 0.26410275 - time (sec): 0.54 - samples/sec: 9510.14 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:59:25,492 epoch 10 - iter 58/292 - loss 0.29335365 - time (sec): 1.13 - samples/sec: 9521.29 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 23:59:26,015 epoch 10 - iter 87/292 - loss 0.30417325 - time (sec): 1.65 - samples/sec: 8740.51 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 23:59:26,576 epoch 10 - iter 116/292 - loss 0.30083326 - time (sec): 2.21 - samples/sec: 8421.16 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:59:27,105 epoch 10 - iter 145/292 - loss 0.30274681 - time (sec): 2.74 - samples/sec: 8203.33 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:59:27,633 epoch 10 - iter 174/292 - loss 0.30387774 - time (sec): 3.27 - samples/sec: 8146.14 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:59:28,112 epoch 10 - iter 203/292 - loss 0.30407753 - time (sec): 3.75 - samples/sec: 8142.22 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 23:59:28,631 epoch 10 - iter 232/292 - loss 0.30962134 - time (sec): 4.27 - samples/sec: 8163.95 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:59:29,152 epoch 10 - iter 261/292 - loss 0.31082971 - time (sec): 4.79 - samples/sec: 8096.13 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 23:59:29,705 epoch 10 - iter 290/292 - loss 0.31152913 - time (sec): 5.34 - samples/sec: 8290.69 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 23:59:29,733 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:29,733 EPOCH 10 done: loss 0.3108 - lr: 0.000000 |
|
2023-10-19 23:59:30,397 DEV : loss 0.2987516224384308 - f1-score (micro avg) 0.2961 |
|
2023-10-19 23:59:30,401 saving best model |
|
2023-10-19 23:59:30,468 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:30,468 Loading model from best epoch ... |
|
2023-10-19 23:59:30,544 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-19 23:59:31,512 |
|
Results: |
|
- F-score (micro) 0.377 |
|
- F-score (macro) 0.1977 |
|
- Accuracy 0.243 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         PER     0.4282    0.4368    0.4324       348 |
|
         LOC     0.3185    0.4100    0.3585       261 |
|
         ORG     0.0000    0.0000    0.0000        52 |
|
   HumanProd     0.0000    0.0000    0.0000        22 |
|

|
   micro avg     0.3748    0.3792    0.3770       683 |
|
   macro avg     0.1867    0.2117    0.1977       683 |
|
weighted avg     0.3399    0.3792    0.3573       683 |
|
|
|
2023-10-19 23:59:31,512 ---------------------------------------------------------------------------------------------------- |
|
|