2023-10-16 18:03:29,369 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,369 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
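The module shapes printed above are enough to recover the backbone's approximate parameter count. A minimal sketch in pure Python (no framework needed), assuming a bias on every Linear layer and weight + bias on every LayerNorm, as the printout indicates:

```python
# Parameter count reconstructed from the module shapes printed above.
# Linear(in, out) has in*out weights + out biases; LayerNorm((768,)) has
# 768 weights + 768 biases; Embedding(n, d) has n*d weights.

def linear(i, o): return i * o + o
def layer_norm(d): return 2 * d

H, FF, VOCAB, POS, TYPES, LAYERS, TAGS = 768, 3072, 32001, 512, 2, 12, 17

embeddings = VOCAB * H + POS * H + TYPES * H + layer_norm(H)

per_layer = (
    3 * linear(H, H)        # query, key, value
    + linear(H, H)          # attention output dense
    + layer_norm(H)         # attention output LayerNorm
    + linear(H, FF)         # intermediate dense
    + linear(FF, H)         # output dense
    + layer_norm(H)         # output LayerNorm
)

pooler = linear(H, H)
head = linear(H, TAGS)      # the (linear) tagging head: 768 -> 17

total = embeddings + LAYERS * per_layer + pooler + head
print(f"{total:,}")  # roughly 110.6M parameters
```

This is the usual BERT-base budget, with a larger vocabulary (32001 entries) accounting for most of the ~25M embedding parameters.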
2023-10-16 18:03:29,370 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Train: 1166 sentences
2023-10-16 18:03:29,370 (train_with_dev=False, train_with_test=False)
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Training Params:
2023-10-16 18:03:29,370 - learning_rate: "5e-05"
2023-10-16 18:03:29,370 - mini_batch_size: "8"
2023-10-16 18:03:29,370 - max_epochs: "10"
2023-10-16 18:03:29,370 - shuffle: "True"
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Plugins:
2023-10-16 18:03:29,370 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
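The LinearScheduler plugin with warmup_fraction '0.1' means the learning rate ramps linearly from 0 to the peak (5e-05) over the first 10% of steps, then decays linearly to 0. With 146 batches per epoch over 10 epochs (1460 steps, so warmup ends exactly at epoch 1), a sketch that approximately reproduces the lr values in the log below (a hypothetical helper, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps=1460, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to 0 (no momentum, matching the log)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 146 = one epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Near the end of epoch 1 (step 140): lr is approximately 0.000048,
# matching the "epoch 1 - iter 140/146" line in the log.
print(round(linear_schedule_lr(140), 6))
```

This also explains why the logged lr peaks around the epoch 1/epoch 2 boundary and reaches 0.000000 on the last epoch-10 iterations.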
2023-10-16 18:03:29,370 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:03:29,371 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 Computation:
2023-10-16 18:03:29,371 - compute on device: cuda:0
2023-10-16 18:03:29,371 - embedding storage: none
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
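The final segment of the base path encodes the run's hyperparameters. The mapping below is inferred from the Training Params section (bs = mini_batch_size, e = max_epochs, lr, crf); the remaining segments (ws*, pooling*, layers*, and the trailing index) are left uninterpreted, so treat this parser as a sketch under that assumption:

```python
import re

base_path = ("hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased"
             "-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1")

# Segment meanings inferred from the Training Params section; hypothetical,
# not an official naming scheme.
params = {
    "batch_size": int(re.search(r"-bs(\d+)-", base_path).group(1)),
    "max_epochs": int(re.search(r"-e(\d+)-", base_path).group(1)),
    "learning_rate": float(re.search(r"-lr([\d.]+e-\d+)-", base_path).group(1)),
    "use_crf": re.search(r"-crf(True|False)", base_path).group(1) == "True",
}
print(params)
```

The extracted values agree with the logged Training Params (mini_batch_size 8, max_epochs 10, learning_rate 5e-05) and with the absence of a CRF layer in the model printout.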
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:30,832 epoch 1 - iter 14/146 - loss 3.01497338 - time (sec): 1.46 - samples/sec: 2840.69 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:03:32,586 epoch 1 - iter 28/146 - loss 2.66432612 - time (sec): 3.21 - samples/sec: 2976.51 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:03:33,881 epoch 1 - iter 42/146 - loss 2.22017613 - time (sec): 4.51 - samples/sec: 2961.61 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:03:35,066 epoch 1 - iter 56/146 - loss 1.88824002 - time (sec): 5.69 - samples/sec: 3029.64 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:03:36,628 epoch 1 - iter 70/146 - loss 1.59845017 - time (sec): 7.26 - samples/sec: 3000.46 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:03:38,129 epoch 1 - iter 84/146 - loss 1.41561660 - time (sec): 8.76 - samples/sec: 2977.65 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:03:39,479 epoch 1 - iter 98/146 - loss 1.31091327 - time (sec): 10.11 - samples/sec: 3005.20 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:03:40,770 epoch 1 - iter 112/146 - loss 1.19566679 - time (sec): 11.40 - samples/sec: 3019.72 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:03:42,096 epoch 1 - iter 126/146 - loss 1.10677521 - time (sec): 12.72 - samples/sec: 2998.40 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:03:43,586 epoch 1 - iter 140/146 - loss 1.01852525 - time (sec): 14.21 - samples/sec: 3008.21 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:44,182 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:44,183 EPOCH 1 done: loss 0.9945 - lr: 0.000048
2023-10-16 18:03:45,011 DEV : loss 0.21267659962177277 - f1-score (micro avg) 0.4215
2023-10-16 18:03:45,017 saving best model
2023-10-16 18:03:45,421 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:47,141 epoch 2 - iter 14/146 - loss 0.19846158 - time (sec): 1.72 - samples/sec: 2526.85 - lr: 0.000050 - momentum: 0.000000
2023-10-16 18:03:48,426 epoch 2 - iter 28/146 - loss 0.25337461 - time (sec): 3.00 - samples/sec: 2817.46 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:03:49,670 epoch 2 - iter 42/146 - loss 0.24819521 - time (sec): 4.25 - samples/sec: 3003.01 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:51,044 epoch 2 - iter 56/146 - loss 0.22898805 - time (sec): 5.62 - samples/sec: 2998.40 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:52,231 epoch 2 - iter 70/146 - loss 0.21737529 - time (sec): 6.81 - samples/sec: 2987.47 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:03:53,641 epoch 2 - iter 84/146 - loss 0.21818368 - time (sec): 8.22 - samples/sec: 2953.87 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:03:55,531 epoch 2 - iter 98/146 - loss 0.21871720 - time (sec): 10.11 - samples/sec: 2914.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:03:57,179 epoch 2 - iter 112/146 - loss 0.21468634 - time (sec): 11.76 - samples/sec: 2886.73 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:03:58,761 epoch 2 - iter 126/146 - loss 0.21766037 - time (sec): 13.34 - samples/sec: 2867.10 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:04:00,342 epoch 2 - iter 140/146 - loss 0.20901073 - time (sec): 14.92 - samples/sec: 2891.15 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:04:00,765 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:00,765 EPOCH 2 done: loss 0.2086 - lr: 0.000045
2023-10-16 18:04:02,072 DEV : loss 0.1228543296456337 - f1-score (micro avg) 0.6842
2023-10-16 18:04:02,078 saving best model
2023-10-16 18:04:02,611 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:04,294 epoch 3 - iter 14/146 - loss 0.14117504 - time (sec): 1.68 - samples/sec: 3205.04 - lr: 0.000044 - momentum: 0.000000
2023-10-16 18:04:05,541 epoch 3 - iter 28/146 - loss 0.12690368 - time (sec): 2.93 - samples/sec: 2907.39 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:04:06,769 epoch 3 - iter 42/146 - loss 0.13116279 - time (sec): 4.16 - samples/sec: 3099.71 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:04:08,164 epoch 3 - iter 56/146 - loss 0.12578848 - time (sec): 5.55 - samples/sec: 3162.13 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:04:09,644 epoch 3 - iter 70/146 - loss 0.11850282 - time (sec): 7.03 - samples/sec: 3196.95 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:04:11,025 epoch 3 - iter 84/146 - loss 0.11666530 - time (sec): 8.41 - samples/sec: 3173.50 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:04:12,066 epoch 3 - iter 98/146 - loss 0.11535129 - time (sec): 9.45 - samples/sec: 3129.27 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:04:13,501 epoch 3 - iter 112/146 - loss 0.11610264 - time (sec): 10.89 - samples/sec: 3096.61 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:04:14,912 epoch 3 - iter 126/146 - loss 0.11279962 - time (sec): 12.30 - samples/sec: 3085.89 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:04:16,435 epoch 3 - iter 140/146 - loss 0.11327376 - time (sec): 13.82 - samples/sec: 3090.83 - lr: 0.000039 - momentum: 0.000000
2023-10-16 18:04:17,059 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:17,059 EPOCH 3 done: loss 0.1124 - lr: 0.000039
2023-10-16 18:04:18,296 DEV : loss 0.11312269419431686 - f1-score (micro avg) 0.7093
2023-10-16 18:04:18,300 saving best model
2023-10-16 18:04:18,736 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:20,303 epoch 4 - iter 14/146 - loss 0.07271600 - time (sec): 1.56 - samples/sec: 2859.63 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:04:21,947 epoch 4 - iter 28/146 - loss 0.08092845 - time (sec): 3.21 - samples/sec: 2813.11 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:04:23,283 epoch 4 - iter 42/146 - loss 0.07155158 - time (sec): 4.54 - samples/sec: 2848.06 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:04:24,679 epoch 4 - iter 56/146 - loss 0.06922215 - time (sec): 5.94 - samples/sec: 2916.61 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:04:26,059 epoch 4 - iter 70/146 - loss 0.07262218 - time (sec): 7.32 - samples/sec: 2935.48 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:04:27,517 epoch 4 - iter 84/146 - loss 0.07370568 - time (sec): 8.78 - samples/sec: 2884.56 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:04:28,799 epoch 4 - iter 98/146 - loss 0.07320739 - time (sec): 10.06 - samples/sec: 2889.11 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:04:30,342 epoch 4 - iter 112/146 - loss 0.07873008 - time (sec): 11.60 - samples/sec: 2889.88 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:04:31,904 epoch 4 - iter 126/146 - loss 0.07512300 - time (sec): 13.16 - samples/sec: 2927.05 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:04:33,248 epoch 4 - iter 140/146 - loss 0.07444114 - time (sec): 14.51 - samples/sec: 2929.89 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:04:33,902 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:33,902 EPOCH 4 done: loss 0.0751 - lr: 0.000034
2023-10-16 18:04:35,153 DEV : loss 0.10812485218048096 - f1-score (micro avg) 0.7446
2023-10-16 18:04:35,158 saving best model
2023-10-16 18:04:35,666 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:37,282 epoch 5 - iter 14/146 - loss 0.05012667 - time (sec): 1.61 - samples/sec: 2978.46 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:04:38,724 epoch 5 - iter 28/146 - loss 0.05362344 - time (sec): 3.05 - samples/sec: 2896.15 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:04:39,994 epoch 5 - iter 42/146 - loss 0.05602592 - time (sec): 4.32 - samples/sec: 3049.46 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:04:41,436 epoch 5 - iter 56/146 - loss 0.05228982 - time (sec): 5.76 - samples/sec: 3067.34 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:04:43,165 epoch 5 - iter 70/146 - loss 0.05023073 - time (sec): 7.49 - samples/sec: 2978.39 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:04:44,560 epoch 5 - iter 84/146 - loss 0.05370307 - time (sec): 8.89 - samples/sec: 2991.28 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:04:45,946 epoch 5 - iter 98/146 - loss 0.05217695 - time (sec): 10.28 - samples/sec: 3019.20 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:04:47,101 epoch 5 - iter 112/146 - loss 0.05378885 - time (sec): 11.43 - samples/sec: 3022.07 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:04:48,507 epoch 5 - iter 126/146 - loss 0.05094919 - time (sec): 12.84 - samples/sec: 3017.31 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:04:49,841 epoch 5 - iter 140/146 - loss 0.05093383 - time (sec): 14.17 - samples/sec: 3019.63 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:04:50,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:50,475 EPOCH 5 done: loss 0.0505 - lr: 0.000028
2023-10-16 18:04:51,717 DEV : loss 0.12893062829971313 - f1-score (micro avg) 0.7451
2023-10-16 18:04:51,722 saving best model
2023-10-16 18:04:52,219 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:53,686 epoch 6 - iter 14/146 - loss 0.05350656 - time (sec): 1.46 - samples/sec: 2785.60 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:04:55,172 epoch 6 - iter 28/146 - loss 0.04160296 - time (sec): 2.95 - samples/sec: 2864.73 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:04:56,281 epoch 6 - iter 42/146 - loss 0.04397600 - time (sec): 4.06 - samples/sec: 2902.44 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:04:57,844 epoch 6 - iter 56/146 - loss 0.04078383 - time (sec): 5.62 - samples/sec: 2960.99 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:04:58,952 epoch 6 - iter 70/146 - loss 0.03860104 - time (sec): 6.73 - samples/sec: 2941.14 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:05:00,612 epoch 6 - iter 84/146 - loss 0.03725238 - time (sec): 8.39 - samples/sec: 2888.25 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:05:02,296 epoch 6 - iter 98/146 - loss 0.03930290 - time (sec): 10.07 - samples/sec: 2919.23 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:05:03,668 epoch 6 - iter 112/146 - loss 0.03821762 - time (sec): 11.44 - samples/sec: 2953.46 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:05:05,152 epoch 6 - iter 126/146 - loss 0.03812065 - time (sec): 12.93 - samples/sec: 2955.70 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:05:06,486 epoch 6 - iter 140/146 - loss 0.03699667 - time (sec): 14.26 - samples/sec: 2988.15 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:05:07,036 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:07,036 EPOCH 6 done: loss 0.0362 - lr: 0.000023
2023-10-16 18:05:08,266 DEV : loss 0.12390300631523132 - f1-score (micro avg) 0.7414
2023-10-16 18:05:08,271 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:09,553 epoch 7 - iter 14/146 - loss 0.01743334 - time (sec): 1.28 - samples/sec: 3239.48 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:05:10,816 epoch 7 - iter 28/146 - loss 0.01770293 - time (sec): 2.54 - samples/sec: 3207.44 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:05:12,064 epoch 7 - iter 42/146 - loss 0.03132041 - time (sec): 3.79 - samples/sec: 3161.58 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:05:13,842 epoch 7 - iter 56/146 - loss 0.03031072 - time (sec): 5.57 - samples/sec: 3063.29 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:05:15,512 epoch 7 - iter 70/146 - loss 0.03222962 - time (sec): 7.24 - samples/sec: 2966.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:05:17,073 epoch 7 - iter 84/146 - loss 0.03034968 - time (sec): 8.80 - samples/sec: 2895.56 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:05:18,505 epoch 7 - iter 98/146 - loss 0.02880115 - time (sec): 10.23 - samples/sec: 2895.60 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:05:19,805 epoch 7 - iter 112/146 - loss 0.02780285 - time (sec): 11.53 - samples/sec: 2939.47 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:05:21,415 epoch 7 - iter 126/146 - loss 0.02885371 - time (sec): 13.14 - samples/sec: 2937.77 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:05:22,691 epoch 7 - iter 140/146 - loss 0.02990561 - time (sec): 14.42 - samples/sec: 2952.74 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:05:23,307 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:23,307 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-16 18:05:24,837 DEV : loss 0.13431765139102936 - f1-score (micro avg) 0.7837
2023-10-16 18:05:24,843 saving best model
2023-10-16 18:05:25,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:26,644 epoch 8 - iter 14/146 - loss 0.01998228 - time (sec): 1.26 - samples/sec: 2827.62 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:05:28,217 epoch 8 - iter 28/146 - loss 0.01806968 - time (sec): 2.83 - samples/sec: 2953.41 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:05:29,581 epoch 8 - iter 42/146 - loss 0.01593014 - time (sec): 4.20 - samples/sec: 2977.35 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:05:31,066 epoch 8 - iter 56/146 - loss 0.01649775 - time (sec): 5.68 - samples/sec: 2983.75 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:05:32,577 epoch 8 - iter 70/146 - loss 0.01690638 - time (sec): 7.19 - samples/sec: 2953.54 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:05:34,016 epoch 8 - iter 84/146 - loss 0.01728427 - time (sec): 8.63 - samples/sec: 2931.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:05:35,351 epoch 8 - iter 98/146 - loss 0.01793460 - time (sec): 9.97 - samples/sec: 2912.50 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:05:36,764 epoch 8 - iter 112/146 - loss 0.02025946 - time (sec): 11.38 - samples/sec: 2919.80 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:05:37,985 epoch 8 - iter 126/146 - loss 0.02049963 - time (sec): 12.60 - samples/sec: 2942.14 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:05:39,695 epoch 8 - iter 140/146 - loss 0.02027892 - time (sec): 14.31 - samples/sec: 2967.94 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:05:40,439 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:40,439 EPOCH 8 done: loss 0.0198 - lr: 0.000012
2023-10-16 18:05:41,765 DEV : loss 0.15437102317810059 - f1-score (micro avg) 0.7479
2023-10-16 18:05:41,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:43,312 epoch 9 - iter 14/146 - loss 0.01195428 - time (sec): 1.54 - samples/sec: 2860.41 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:05:44,705 epoch 9 - iter 28/146 - loss 0.00962441 - time (sec): 2.93 - samples/sec: 2836.03 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:05:46,047 epoch 9 - iter 42/146 - loss 0.01059304 - time (sec): 4.27 - samples/sec: 2842.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:05:47,676 epoch 9 - iter 56/146 - loss 0.01847025 - time (sec): 5.90 - samples/sec: 2896.60 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:05:49,125 epoch 9 - iter 70/146 - loss 0.01586688 - time (sec): 7.35 - samples/sec: 2895.07 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:05:50,344 epoch 9 - iter 84/146 - loss 0.01537808 - time (sec): 8.57 - samples/sec: 2959.60 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:05:51,778 epoch 9 - iter 98/146 - loss 0.01688221 - time (sec): 10.00 - samples/sec: 2940.78 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:05:53,187 epoch 9 - iter 112/146 - loss 0.01695533 - time (sec): 11.41 - samples/sec: 2952.98 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:05:54,787 epoch 9 - iter 126/146 - loss 0.01675493 - time (sec): 13.01 - samples/sec: 2933.72 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:05:56,392 epoch 9 - iter 140/146 - loss 0.01614276 - time (sec): 14.62 - samples/sec: 2926.73 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:05:56,937 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:56,938 EPOCH 9 done: loss 0.0156 - lr: 0.000006
2023-10-16 18:05:58,185 DEV : loss 0.15635186433792114 - f1-score (micro avg) 0.7368
2023-10-16 18:05:58,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:59,477 epoch 10 - iter 14/146 - loss 0.00619091 - time (sec): 1.29 - samples/sec: 2928.89 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:06:01,000 epoch 10 - iter 28/146 - loss 0.00503078 - time (sec): 2.81 - samples/sec: 3186.76 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:06:02,933 epoch 10 - iter 42/146 - loss 0.01384968 - time (sec): 4.74 - samples/sec: 3059.21 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:06:04,330 epoch 10 - iter 56/146 - loss 0.01118529 - time (sec): 6.14 - samples/sec: 3127.52 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:06:05,732 epoch 10 - iter 70/146 - loss 0.01020773 - time (sec): 7.54 - samples/sec: 3083.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:07,104 epoch 10 - iter 84/146 - loss 0.01045028 - time (sec): 8.91 - samples/sec: 3031.46 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:08,416 epoch 10 - iter 98/146 - loss 0.00963056 - time (sec): 10.22 - samples/sec: 3020.03 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:06:09,777 epoch 10 - iter 112/146 - loss 0.01023756 - time (sec): 11.59 - samples/sec: 2991.65 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:06:11,036 epoch 10 - iter 126/146 - loss 0.01175292 - time (sec): 12.84 - samples/sec: 2989.54 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:06:12,247 epoch 10 - iter 140/146 - loss 0.01174241 - time (sec): 14.06 - samples/sec: 3014.32 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:06:12,946 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:12,946 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-16 18:06:14,485 DEV : loss 0.16552899777889252 - f1-score (micro avg) 0.7468
2023-10-16 18:06:14,965 ----------------------------------------------------------------------------------------------------
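The trainer checkpoints best-model.pt whenever the dev micro-F1 improves (the "saving best model" lines), so the model loaded next is from epoch 7 (dev F1 0.7837), not epoch 10. The selection reduces to an argmax over the per-epoch DEV scores, sketched here with the values transcribed from the log above:

```python
# Dev micro-F1 per epoch, transcribed from the DEV lines above.
dev_f1 = [0.4215, 0.6842, 0.7093, 0.7446, 0.7451, 0.7414,
          0.7837, 0.7479, 0.7368, 0.7468]

best_epoch = max(range(len(dev_f1)), key=dev_f1.__getitem__) + 1
print(best_epoch, dev_f1[best_epoch - 1])  # prints: 7 0.7837
```

Consistent with this, epochs 6, 8, 9 and 10 show no "saving best model" line: their dev F1 never exceeds the epoch-7 value.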
2023-10-16 18:06:14,967 Loading model from best epoch ...
2023-10-16 18:06:16,437 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
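The 17-tag dictionary is the BIOES scheme over the corpus's four entity types: O plus {S, B, E, I}-prefixed tags for LOC, PER, ORG and HumanProd (4 types × 4 prefixes + 1 = 17). As a sketch, in the same order the tagger prints them:

```python
# BIOES tag set: S = single-token span, B/I/E = begin/inside/end of a
# multi-token span, O = outside any entity.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in ("S", "B", "E", "I")]
print(len(tags), tags)  # 17 tags, matching the dictionary above
```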
2023-10-16 18:06:18,838
Results:
- F-score (micro) 0.7565
- F-score (macro) 0.6684
- Accuracy 0.6286
By class:
              precision    recall  f1-score   support

         PER     0.7958    0.8621    0.8276       348
         LOC     0.6524    0.8199    0.7267       261
         ORG     0.4773    0.4038    0.4375        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7134    0.8053    0.7565       683
   macro avg     0.6518    0.6919    0.6684       683
weighted avg     0.7131    0.8053    0.7546       683
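The micro-averaged F-score follows from the micro precision and recall in the table via f1 = 2pr / (p + r). Recomputing from the rounded table values lands within rounding error of the reported 0.7565:

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Micro precision/recall from the table; inputs are already rounded to 4
# decimals, so the result differs from the reported 0.7565 in the last digit.
micro = f1(0.7134, 0.8053)
print(round(micro, 4))  # 0.7566
```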
2023-10-16 18:06:18,838 ----------------------------------------------------------------------------------------------------