|
2023-10-20 00:10:01,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,865 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 Train: 1085 sentences |
|
2023-10-20 00:10:01,866 (train_with_dev=False, train_with_test=False) |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 Training Params: |
|
2023-10-20 00:10:01,866 - learning_rate: "5e-05" |
|
2023-10-20 00:10:01,866 - mini_batch_size: "4" |
|
2023-10-20 00:10:01,866 - max_epochs: "10" |
|
2023-10-20 00:10:01,866 - shuffle: "True" |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 Plugins: |
|
2023-10-20 00:10:01,866 - TensorboardLogger |
|
2023-10-20 00:10:01,866 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-20 00:10:01,866 - metric: "('micro avg', 'f1-score')" |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 Computation: |
|
2023-10-20 00:10:01,866 - compute on device: cuda:0 |
|
2023-10-20 00:10:01,866 - embedding storage: none |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:01,867 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-20 00:10:02,379 epoch 1 - iter 27/272 - loss 3.27370034 - time (sec): 0.51 - samples/sec: 10928.06 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:10:02,881 epoch 1 - iter 54/272 - loss 3.17903863 - time (sec): 1.01 - samples/sec: 10349.91 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-20 00:10:03,381 epoch 1 - iter 81/272 - loss 3.03675988 - time (sec): 1.51 - samples/sec: 10667.36 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:10:03,827 epoch 1 - iter 108/272 - loss 2.83129903 - time (sec): 1.96 - samples/sec: 10660.83 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-20 00:10:04,268 epoch 1 - iter 135/272 - loss 2.62136473 - time (sec): 2.40 - samples/sec: 10574.98 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-20 00:10:04,747 epoch 1 - iter 162/272 - loss 2.33357166 - time (sec): 2.88 - samples/sec: 10810.83 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-20 00:10:05,198 epoch 1 - iter 189/272 - loss 2.15035600 - time (sec): 3.33 - samples/sec: 10730.77 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-20 00:10:05,694 epoch 1 - iter 216/272 - loss 1.94404664 - time (sec): 3.83 - samples/sec: 10777.68 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-20 00:10:06,158 epoch 1 - iter 243/272 - loss 1.80696099 - time (sec): 4.29 - samples/sec: 10804.88 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-20 00:10:06,681 epoch 1 - iter 270/272 - loss 1.68559515 - time (sec): 4.81 - samples/sec: 10764.05 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-20 00:10:06,710 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:06,710 EPOCH 1 done: loss 1.6836 - lr: 0.000049 |
|
2023-10-20 00:10:06,979 DEV : loss 0.47485587000846863 - f1-score (micro avg) 0.0 |
|
2023-10-20 00:10:06,983 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:07,504 epoch 2 - iter 27/272 - loss 0.71854839 - time (sec): 0.52 - samples/sec: 10563.71 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-20 00:10:08,029 epoch 2 - iter 54/272 - loss 0.69126901 - time (sec): 1.05 - samples/sec: 10961.74 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-20 00:10:08,524 epoch 2 - iter 81/272 - loss 0.69336256 - time (sec): 1.54 - samples/sec: 10565.69 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-20 00:10:09,042 epoch 2 - iter 108/272 - loss 0.68028598 - time (sec): 2.06 - samples/sec: 10578.73 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-20 00:10:09,569 epoch 2 - iter 135/272 - loss 0.65069861 - time (sec): 2.59 - samples/sec: 10767.83 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-20 00:10:10,032 epoch 2 - iter 162/272 - loss 0.62575079 - time (sec): 3.05 - samples/sec: 10542.90 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-20 00:10:10,535 epoch 2 - iter 189/272 - loss 0.60004979 - time (sec): 3.55 - samples/sec: 10434.51 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-20 00:10:11,037 epoch 2 - iter 216/272 - loss 0.59524114 - time (sec): 4.05 - samples/sec: 10334.91 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-20 00:10:11,546 epoch 2 - iter 243/272 - loss 0.58323312 - time (sec): 4.56 - samples/sec: 10241.49 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-20 00:10:12,008 epoch 2 - iter 270/272 - loss 0.56554405 - time (sec): 5.03 - samples/sec: 10283.98 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-20 00:10:12,042 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:12,042 EPOCH 2 done: loss 0.5664 - lr: 0.000045 |
|
2023-10-20 00:10:12,800 DEV : loss 0.37471088767051697 - f1-score (micro avg) 0.0 |
|
2023-10-20 00:10:12,805 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:13,321 epoch 3 - iter 27/272 - loss 0.50465870 - time (sec): 0.52 - samples/sec: 9483.24 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-20 00:10:13,863 epoch 3 - iter 54/272 - loss 0.48630798 - time (sec): 1.06 - samples/sec: 9472.20 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-20 00:10:14,373 epoch 3 - iter 81/272 - loss 0.46931093 - time (sec): 1.57 - samples/sec: 10238.28 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-20 00:10:14,834 epoch 3 - iter 108/272 - loss 0.45588380 - time (sec): 2.03 - samples/sec: 10557.30 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-20 00:10:15,459 epoch 3 - iter 135/272 - loss 0.45732524 - time (sec): 2.65 - samples/sec: 9906.58 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-20 00:10:15,898 epoch 3 - iter 162/272 - loss 0.45347777 - time (sec): 3.09 - samples/sec: 10045.11 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-20 00:10:16,342 epoch 3 - iter 189/272 - loss 0.45441373 - time (sec): 3.54 - samples/sec: 9992.82 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-20 00:10:16,797 epoch 3 - iter 216/272 - loss 0.45094356 - time (sec): 3.99 - samples/sec: 10134.42 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-20 00:10:17,287 epoch 3 - iter 243/272 - loss 0.44316849 - time (sec): 4.48 - samples/sec: 10179.48 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-20 00:10:17,774 epoch 3 - iter 270/272 - loss 0.43558974 - time (sec): 4.97 - samples/sec: 10432.51 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-20 00:10:17,801 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:17,801 EPOCH 3 done: loss 0.4352 - lr: 0.000039 |
|
2023-10-20 00:10:18,558 DEV : loss 0.2934219241142273 - f1-score (micro avg) 0.2097 |
|
2023-10-20 00:10:18,562 saving best model |
|
2023-10-20 00:10:18,589 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:19,077 epoch 4 - iter 27/272 - loss 0.34274907 - time (sec): 0.49 - samples/sec: 10600.34 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-20 00:10:19,548 epoch 4 - iter 54/272 - loss 0.38585840 - time (sec): 0.96 - samples/sec: 10938.94 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-20 00:10:20,038 epoch 4 - iter 81/272 - loss 0.41785414 - time (sec): 1.45 - samples/sec: 11044.43 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-20 00:10:20,485 epoch 4 - iter 108/272 - loss 0.40897222 - time (sec): 1.90 - samples/sec: 11022.39 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-20 00:10:20,950 epoch 4 - iter 135/272 - loss 0.38196432 - time (sec): 2.36 - samples/sec: 11118.53 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-20 00:10:21,402 epoch 4 - iter 162/272 - loss 0.37036420 - time (sec): 2.81 - samples/sec: 10991.66 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-20 00:10:21,870 epoch 4 - iter 189/272 - loss 0.38038372 - time (sec): 3.28 - samples/sec: 11153.47 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-20 00:10:22,302 epoch 4 - iter 216/272 - loss 0.37697104 - time (sec): 3.71 - samples/sec: 11153.61 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-20 00:10:22,766 epoch 4 - iter 243/272 - loss 0.37353457 - time (sec): 4.18 - samples/sec: 11184.82 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-20 00:10:23,219 epoch 4 - iter 270/272 - loss 0.38186740 - time (sec): 4.63 - samples/sec: 11185.00 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-20 00:10:23,245 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:23,245 EPOCH 4 done: loss 0.3816 - lr: 0.000033 |
|
2023-10-20 00:10:24,005 DEV : loss 0.2845022976398468 - f1-score (micro avg) 0.3411 |
|
2023-10-20 00:10:24,009 saving best model |
|
2023-10-20 00:10:24,041 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:24,551 epoch 5 - iter 27/272 - loss 0.31416516 - time (sec): 0.51 - samples/sec: 11673.42 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-20 00:10:25,066 epoch 5 - iter 54/272 - loss 0.34632047 - time (sec): 1.02 - samples/sec: 11559.14 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-20 00:10:25,589 epoch 5 - iter 81/272 - loss 0.34140409 - time (sec): 1.55 - samples/sec: 11346.81 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-20 00:10:26,078 epoch 5 - iter 108/272 - loss 0.34931667 - time (sec): 2.04 - samples/sec: 10850.34 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-20 00:10:26,573 epoch 5 - iter 135/272 - loss 0.36838290 - time (sec): 2.53 - samples/sec: 10738.64 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-20 00:10:27,076 epoch 5 - iter 162/272 - loss 0.35263557 - time (sec): 3.03 - samples/sec: 10855.75 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-20 00:10:27,592 epoch 5 - iter 189/272 - loss 0.34759117 - time (sec): 3.55 - samples/sec: 10714.03 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-20 00:10:28,066 epoch 5 - iter 216/272 - loss 0.34635807 - time (sec): 4.02 - samples/sec: 10493.00 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-20 00:10:28,552 epoch 5 - iter 243/272 - loss 0.34919162 - time (sec): 4.51 - samples/sec: 10458.26 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-20 00:10:29,031 epoch 5 - iter 270/272 - loss 0.34703800 - time (sec): 4.99 - samples/sec: 10371.11 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-20 00:10:29,062 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:29,062 EPOCH 5 done: loss 0.3459 - lr: 0.000028 |
|
2023-10-20 00:10:29,813 DEV : loss 0.2660656273365021 - f1-score (micro avg) 0.4177 |
|
2023-10-20 00:10:29,817 saving best model |
|
2023-10-20 00:10:29,850 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:30,362 epoch 6 - iter 27/272 - loss 0.33120098 - time (sec): 0.51 - samples/sec: 9847.18 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-20 00:10:30,859 epoch 6 - iter 54/272 - loss 0.30382872 - time (sec): 1.01 - samples/sec: 10381.06 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-20 00:10:31,358 epoch 6 - iter 81/272 - loss 0.30565634 - time (sec): 1.51 - samples/sec: 10442.74 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-20 00:10:31,882 epoch 6 - iter 108/272 - loss 0.33247609 - time (sec): 2.03 - samples/sec: 10860.62 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-20 00:10:32,392 epoch 6 - iter 135/272 - loss 0.32788335 - time (sec): 2.54 - samples/sec: 10729.03 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-20 00:10:32,846 epoch 6 - iter 162/272 - loss 0.32789529 - time (sec): 3.00 - samples/sec: 10587.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-20 00:10:33,367 epoch 6 - iter 189/272 - loss 0.33278159 - time (sec): 3.52 - samples/sec: 10501.36 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-20 00:10:33,869 epoch 6 - iter 216/272 - loss 0.32579663 - time (sec): 4.02 - samples/sec: 10524.59 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-20 00:10:34,364 epoch 6 - iter 243/272 - loss 0.32305445 - time (sec): 4.51 - samples/sec: 10491.65 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-20 00:10:34,831 epoch 6 - iter 270/272 - loss 0.32336884 - time (sec): 4.98 - samples/sec: 10419.07 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-20 00:10:34,857 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:34,857 EPOCH 6 done: loss 0.3229 - lr: 0.000022 |
|
2023-10-20 00:10:35,629 DEV : loss 0.25167223811149597 - f1-score (micro avg) 0.4637 |
|
2023-10-20 00:10:35,632 saving best model |
|
2023-10-20 00:10:35,666 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:36,187 epoch 7 - iter 27/272 - loss 0.40516393 - time (sec): 0.52 - samples/sec: 10828.35 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-20 00:10:36,726 epoch 7 - iter 54/272 - loss 0.33504991 - time (sec): 1.06 - samples/sec: 10997.35 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-20 00:10:37,225 epoch 7 - iter 81/272 - loss 0.31111585 - time (sec): 1.56 - samples/sec: 10971.57 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-20 00:10:37,689 epoch 7 - iter 108/272 - loss 0.31209626 - time (sec): 2.02 - samples/sec: 10531.08 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-20 00:10:38,190 epoch 7 - iter 135/272 - loss 0.30132911 - time (sec): 2.52 - samples/sec: 10468.80 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-20 00:10:38,689 epoch 7 - iter 162/272 - loss 0.30901869 - time (sec): 3.02 - samples/sec: 10352.48 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-20 00:10:39,212 epoch 7 - iter 189/272 - loss 0.31738761 - time (sec): 3.55 - samples/sec: 10349.55 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-20 00:10:39,725 epoch 7 - iter 216/272 - loss 0.31026399 - time (sec): 4.06 - samples/sec: 10154.23 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-20 00:10:40,258 epoch 7 - iter 243/272 - loss 0.31390048 - time (sec): 4.59 - samples/sec: 10250.46 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-20 00:10:40,763 epoch 7 - iter 270/272 - loss 0.31141067 - time (sec): 5.10 - samples/sec: 10155.93 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-20 00:10:40,797 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:40,797 EPOCH 7 done: loss 0.3113 - lr: 0.000017 |
|
2023-10-20 00:10:41,584 DEV : loss 0.24986842274665833 - f1-score (micro avg) 0.4673 |
|
2023-10-20 00:10:41,588 saving best model |
|
2023-10-20 00:10:41,621 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:42,151 epoch 8 - iter 27/272 - loss 0.21596682 - time (sec): 0.53 - samples/sec: 10324.40 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-20 00:10:42,704 epoch 8 - iter 54/272 - loss 0.25341462 - time (sec): 1.08 - samples/sec: 9899.98 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-20 00:10:43,230 epoch 8 - iter 81/272 - loss 0.27916670 - time (sec): 1.61 - samples/sec: 9787.12 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:10:43,790 epoch 8 - iter 108/272 - loss 0.31225710 - time (sec): 2.17 - samples/sec: 9838.49 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:10:44,441 epoch 8 - iter 135/272 - loss 0.30190443 - time (sec): 2.82 - samples/sec: 9574.76 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:10:44,980 epoch 8 - iter 162/272 - loss 0.29499440 - time (sec): 3.36 - samples/sec: 9532.85 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-20 00:10:45,536 epoch 8 - iter 189/272 - loss 0.29255034 - time (sec): 3.91 - samples/sec: 9549.95 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-20 00:10:46,020 epoch 8 - iter 216/272 - loss 0.29419704 - time (sec): 4.40 - samples/sec: 9476.05 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:10:46,489 epoch 8 - iter 243/272 - loss 0.29641175 - time (sec): 4.87 - samples/sec: 9471.44 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:10:47,006 epoch 8 - iter 270/272 - loss 0.29820061 - time (sec): 5.38 - samples/sec: 9579.52 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:10:47,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:47,045 EPOCH 8 done: loss 0.2978 - lr: 0.000011 |
|
2023-10-20 00:10:47,812 DEV : loss 0.2413243055343628 - f1-score (micro avg) 0.4778 |
|
2023-10-20 00:10:47,816 saving best model |
|
2023-10-20 00:10:47,848 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:48,335 epoch 9 - iter 27/272 - loss 0.30738887 - time (sec): 0.49 - samples/sec: 10001.71 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:10:48,837 epoch 9 - iter 54/272 - loss 0.28206474 - time (sec): 0.99 - samples/sec: 9975.60 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-20 00:10:49,367 epoch 9 - iter 81/272 - loss 0.28407417 - time (sec): 1.52 - samples/sec: 10336.58 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:10:49,854 epoch 9 - iter 108/272 - loss 0.30604991 - time (sec): 2.01 - samples/sec: 10319.44 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:10:50,353 epoch 9 - iter 135/272 - loss 0.29423229 - time (sec): 2.50 - samples/sec: 10262.21 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:10:50,847 epoch 9 - iter 162/272 - loss 0.30151775 - time (sec): 3.00 - samples/sec: 10223.80 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:10:51,348 epoch 9 - iter 189/272 - loss 0.29419587 - time (sec): 3.50 - samples/sec: 10128.05 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-20 00:10:51,848 epoch 9 - iter 216/272 - loss 0.29606309 - time (sec): 4.00 - samples/sec: 10152.35 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-20 00:10:52,373 epoch 9 - iter 243/272 - loss 0.29534319 - time (sec): 4.52 - samples/sec: 10314.68 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:10:52,882 epoch 9 - iter 270/272 - loss 0.29063737 - time (sec): 5.03 - samples/sec: 10307.02 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:10:52,911 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:52,911 EPOCH 9 done: loss 0.2906 - lr: 0.000006 |
|
2023-10-20 00:10:53,695 DEV : loss 0.2407713383436203 - f1-score (micro avg) 0.4847 |
|
2023-10-20 00:10:53,699 saving best model |
|
2023-10-20 00:10:53,730 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:54,262 epoch 10 - iter 27/272 - loss 0.26292708 - time (sec): 0.53 - samples/sec: 10922.93 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:10:54,737 epoch 10 - iter 54/272 - loss 0.30143339 - time (sec): 1.01 - samples/sec: 9895.19 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:10:55,241 epoch 10 - iter 81/272 - loss 0.28174577 - time (sec): 1.51 - samples/sec: 9911.78 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:10:55,788 epoch 10 - iter 108/272 - loss 0.29077417 - time (sec): 2.06 - samples/sec: 9840.15 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:10:56,379 epoch 10 - iter 135/272 - loss 0.29242246 - time (sec): 2.65 - samples/sec: 9882.17 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:10:56,922 epoch 10 - iter 162/272 - loss 0.29184925 - time (sec): 3.19 - samples/sec: 9737.74 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:10:57,415 epoch 10 - iter 189/272 - loss 0.28160903 - time (sec): 3.68 - samples/sec: 9886.93 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:10:57,922 epoch 10 - iter 216/272 - loss 0.29090535 - time (sec): 4.19 - samples/sec: 10019.03 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:10:58,409 epoch 10 - iter 243/272 - loss 0.28667582 - time (sec): 4.68 - samples/sec: 9938.18 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:10:58,920 epoch 10 - iter 270/272 - loss 0.28590550 - time (sec): 5.19 - samples/sec: 9980.62 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-20 00:10:58,949 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:58,949 EPOCH 10 done: loss 0.2864 - lr: 0.000000 |
|
2023-10-20 00:10:59,722 DEV : loss 0.2398698478937149 - f1-score (micro avg) 0.4836 |
|
2023-10-20 00:10:59,754 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:10:59,755 Loading model from best epoch ... |
|
2023-10-20 00:10:59,829 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-20 00:11:00,637 |
|
Results: |
|
- F-score (micro) 0.4062 |
|
- F-score (macro) 0.2043 |
|
- Accuracy 0.2646 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.5027    0.5962    0.5455       312 |
|
         PER     0.2461    0.3029    0.2716       208 |
|
         ORG     0.0000    0.0000    0.0000        55 |
|
   HumanProd     0.0000    0.0000    0.0000        22 |
|
|
|
   micro avg     0.3959    0.4171    0.4062       597 |
|
   macro avg     0.1872    0.2248    0.2043       597 |
|
weighted avg     0.3485    0.4171    0.3797       597 |
|
|
|
2023-10-20 00:11:00,637 ---------------------------------------------------------------------------------------------------- |
|
|