2023-10-25 16:14:52,812 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,813 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(64001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-25 16:14:52,814 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,814 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 16:14:52,814 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,814 Train: 7142 sentences
2023-10-25 16:14:52,814 (train_with_dev=False, train_with_test=False)
2023-10-25 16:14:52,814 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,814 Training Params:
2023-10-25 16:14:52,814 - learning_rate: "5e-05"
2023-10-25 16:14:52,814 - mini_batch_size: "8"
2023-10-25 16:14:52,815 - max_epochs: "10"
2023-10-25 16:14:52,815 - shuffle: "True"
2023-10-25 16:14:52,815 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,815 Plugins:
2023-10-25 16:14:52,815 - TensorboardLogger
2023-10-25 16:14:52,815 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 16:14:52,815 ----------------------------------------------------------------------------------------------------
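Note on the `lr` column in the iteration lines that follow: the logged values look consistent with a linear warmup over the first 10% of optimizer steps, followed by a linear decay to zero. The sketch below reproduces that shape under this assumption; it is not Flair's actual scheduler code. The function name `linear_warmup_lr` is hypothetical, and the step counts are derived from the logged 7142 training sentences, mini-batch size 8 and 10 epochs.

```python
import math

# Values taken from the log: 7142 train sentences, mini_batch_size 8, max_epochs 10.
steps_per_epoch = math.ceil(7142 / 8)
total_steps = steps_per_epoch * 10


def linear_warmup_lr(step, peak_lr=5e-5, total=total_steps, warmup_fraction=0.1):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: scale linearly from peak_lr down to 0 at the final step.
    return peak_lr * (total - step) / (total - warmup_steps)


# Compare against a few logged iterations (epoch 1 iter 89, epoch 2 iter 89).
for step in (89, steps_per_epoch + 89, total_steps):
    print(step, f"{linear_warmup_lr(step):.6f}")
```

Under these assumptions the warmup ends exactly at the last iteration of epoch 1, which matches the logged peak of 0.000050 at that point.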
2023-10-25 16:14:52,815 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 16:14:52,815 - metric: "('micro avg', 'f1-score')"
2023-10-25 16:14:52,815 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,815 Computation:
2023-10-25 16:14:52,815 - compute on device: cuda:0
2023-10-25 16:14:52,815 - embedding storage: none
2023-10-25 16:14:52,815 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,815 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 16:14:52,815 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,815 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:52,815 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 16:14:58,476 epoch 1 - iter 89/893 - loss 1.75249226 - time (sec): 5.66 - samples/sec: 4225.04 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:15:04,047 epoch 1 - iter 178/893 - loss 1.12358214 - time (sec): 11.23 - samples/sec: 4331.58 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:15:09,631 epoch 1 - iter 267/893 - loss 0.85862892 - time (sec): 16.81 - samples/sec: 4346.69 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:15:15,653 epoch 1 - iter 356/893 - loss 0.68950207 - time (sec): 22.84 - samples/sec: 4359.93 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:15:21,583 epoch 1 - iter 445/893 - loss 0.59145739 - time (sec): 28.77 - samples/sec: 4287.67 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:15:27,239 epoch 1 - iter 534/893 - loss 0.52114142 - time (sec): 34.42 - samples/sec: 4306.58 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:15:32,900 epoch 1 - iter 623/893 - loss 0.46529929 - time (sec): 40.08 - samples/sec: 4352.66 - lr: 0.000035 - momentum: 0.000000
2023-10-25 16:15:38,561 epoch 1 - iter 712/893 - loss 0.42604012 - time (sec): 45.75 - samples/sec: 4351.79 - lr: 0.000040 - momentum: 0.000000
2023-10-25 16:15:44,304 epoch 1 - iter 801/893 - loss 0.39523829 - time (sec): 51.49 - samples/sec: 4338.77 - lr: 0.000045 - momentum: 0.000000
2023-10-25 16:15:49,757 epoch 1 - iter 890/893 - loss 0.37145848 - time (sec): 56.94 - samples/sec: 4358.82 - lr: 0.000050 - momentum: 0.000000
2023-10-25 16:15:49,923 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:49,923 EPOCH 1 done: loss 0.3708 - lr: 0.000050
2023-10-25 16:15:53,793 DEV : loss 0.11669864505529404 - f1-score (micro avg) 0.7017
2023-10-25 16:15:53,817 saving best model
2023-10-25 16:15:54,256 ----------------------------------------------------------------------------------------------------
2023-10-25 16:16:00,060 epoch 2 - iter 89/893 - loss 0.09474515 - time (sec): 5.80 - samples/sec: 4339.83 - lr: 0.000049 - momentum: 0.000000
2023-10-25 16:16:05,594 epoch 2 - iter 178/893 - loss 0.10318396 - time (sec): 11.34 - samples/sec: 4160.10 - lr: 0.000049 - momentum: 0.000000
2023-10-25 16:16:11,390 epoch 2 - iter 267/893 - loss 0.10509808 - time (sec): 17.13 - samples/sec: 4224.80 - lr: 0.000048 - momentum: 0.000000
2023-10-25 16:16:17,233 epoch 2 - iter 356/893 - loss 0.10228761 - time (sec): 22.97 - samples/sec: 4354.62 - lr: 0.000048 - momentum: 0.000000
2023-10-25 16:16:22,957 epoch 2 - iter 445/893 - loss 0.10257400 - time (sec): 28.70 - samples/sec: 4341.65 - lr: 0.000047 - momentum: 0.000000
2023-10-25 16:16:28,542 epoch 2 - iter 534/893 - loss 0.10603402 - time (sec): 34.28 - samples/sec: 4337.53 - lr: 0.000047 - momentum: 0.000000
2023-10-25 16:16:34,303 epoch 2 - iter 623/893 - loss 0.10506118 - time (sec): 40.05 - samples/sec: 4338.75 - lr: 0.000046 - momentum: 0.000000
2023-10-25 16:16:39,929 epoch 2 - iter 712/893 - loss 0.10425592 - time (sec): 45.67 - samples/sec: 4328.82 - lr: 0.000046 - momentum: 0.000000
2023-10-25 16:16:45,809 epoch 2 - iter 801/893 - loss 0.10345125 - time (sec): 51.55 - samples/sec: 4331.43 - lr: 0.000045 - momentum: 0.000000
2023-10-25 16:16:51,516 epoch 2 - iter 890/893 - loss 0.10408064 - time (sec): 57.26 - samples/sec: 4330.98 - lr: 0.000044 - momentum: 0.000000
2023-10-25 16:16:51,691 ----------------------------------------------------------------------------------------------------
2023-10-25 16:16:51,692 EPOCH 2 done: loss 0.1040 - lr: 0.000044
2023-10-25 16:16:55,679 DEV : loss 0.10159432888031006 - f1-score (micro avg) 0.7568
2023-10-25 16:16:55,697 saving best model
2023-10-25 16:16:56,349 ----------------------------------------------------------------------------------------------------
2023-10-25 16:17:01,901 epoch 3 - iter 89/893 - loss 0.06874626 - time (sec): 5.55 - samples/sec: 4285.04 - lr: 0.000044 - momentum: 0.000000
2023-10-25 16:17:07,545 epoch 3 - iter 178/893 - loss 0.06298047 - time (sec): 11.19 - samples/sec: 4308.04 - lr: 0.000043 - momentum: 0.000000
2023-10-25 16:17:12,918 epoch 3 - iter 267/893 - loss 0.06537415 - time (sec): 16.57 - samples/sec: 4389.94 - lr: 0.000043 - momentum: 0.000000
2023-10-25 16:17:18,386 epoch 3 - iter 356/893 - loss 0.06543963 - time (sec): 22.03 - samples/sec: 4360.35 - lr: 0.000042 - momentum: 0.000000
2023-10-25 16:17:23,822 epoch 3 - iter 445/893 - loss 0.06548496 - time (sec): 27.47 - samples/sec: 4365.39 - lr: 0.000042 - momentum: 0.000000
2023-10-25 16:17:29,478 epoch 3 - iter 534/893 - loss 0.06655645 - time (sec): 33.13 - samples/sec: 4387.73 - lr: 0.000041 - momentum: 0.000000
2023-10-25 16:17:35,152 epoch 3 - iter 623/893 - loss 0.06423904 - time (sec): 38.80 - samples/sec: 4396.78 - lr: 0.000041 - momentum: 0.000000
2023-10-25 16:17:41,870 epoch 3 - iter 712/893 - loss 0.06350497 - time (sec): 45.52 - samples/sec: 4335.38 - lr: 0.000040 - momentum: 0.000000
2023-10-25 16:17:47,437 epoch 3 - iter 801/893 - loss 0.06463976 - time (sec): 51.09 - samples/sec: 4361.36 - lr: 0.000039 - momentum: 0.000000
2023-10-25 16:17:52,859 epoch 3 - iter 890/893 - loss 0.06360620 - time (sec): 56.51 - samples/sec: 4387.69 - lr: 0.000039 - momentum: 0.000000
2023-10-25 16:17:53,023 ----------------------------------------------------------------------------------------------------
2023-10-25 16:17:53,024 EPOCH 3 done: loss 0.0636 - lr: 0.000039
2023-10-25 16:17:57,577 DEV : loss 0.12474026530981064 - f1-score (micro avg) 0.7885
2023-10-25 16:17:57,598 saving best model
2023-10-25 16:17:58,246 ----------------------------------------------------------------------------------------------------
2023-10-25 16:18:03,755 epoch 4 - iter 89/893 - loss 0.04272137 - time (sec): 5.51 - samples/sec: 4315.01 - lr: 0.000038 - momentum: 0.000000
2023-10-25 16:18:09,537 epoch 4 - iter 178/893 - loss 0.04287044 - time (sec): 11.29 - samples/sec: 4269.15 - lr: 0.000038 - momentum: 0.000000
2023-10-25 16:18:15,017 epoch 4 - iter 267/893 - loss 0.04645572 - time (sec): 16.77 - samples/sec: 4258.22 - lr: 0.000037 - momentum: 0.000000
2023-10-25 16:18:20,857 epoch 4 - iter 356/893 - loss 0.04425794 - time (sec): 22.61 - samples/sec: 4315.23 - lr: 0.000037 - momentum: 0.000000
2023-10-25 16:18:26,448 epoch 4 - iter 445/893 - loss 0.04423257 - time (sec): 28.20 - samples/sec: 4314.59 - lr: 0.000036 - momentum: 0.000000
2023-10-25 16:18:32,383 epoch 4 - iter 534/893 - loss 0.04415217 - time (sec): 34.14 - samples/sec: 4309.21 - lr: 0.000036 - momentum: 0.000000
2023-10-25 16:18:38,252 epoch 4 - iter 623/893 - loss 0.04351186 - time (sec): 40.00 - samples/sec: 4308.17 - lr: 0.000035 - momentum: 0.000000
2023-10-25 16:18:44,167 epoch 4 - iter 712/893 - loss 0.04498748 - time (sec): 45.92 - samples/sec: 4323.66 - lr: 0.000034 - momentum: 0.000000
2023-10-25 16:18:50,010 epoch 4 - iter 801/893 - loss 0.04527081 - time (sec): 51.76 - samples/sec: 4300.32 - lr: 0.000034 - momentum: 0.000000
2023-10-25 16:18:55,870 epoch 4 - iter 890/893 - loss 0.04646286 - time (sec): 57.62 - samples/sec: 4307.11 - lr: 0.000033 - momentum: 0.000000
2023-10-25 16:18:56,037 ----------------------------------------------------------------------------------------------------
2023-10-25 16:18:56,037 EPOCH 4 done: loss 0.0466 - lr: 0.000033
2023-10-25 16:19:00,739 DEV : loss 0.14563852548599243 - f1-score (micro avg) 0.7995
2023-10-25 16:19:00,759 saving best model
2023-10-25 16:19:01,394 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:07,059 epoch 5 - iter 89/893 - loss 0.03479555 - time (sec): 5.66 - samples/sec: 4283.20 - lr: 0.000033 - momentum: 0.000000
2023-10-25 16:19:12,717 epoch 5 - iter 178/893 - loss 0.03717067 - time (sec): 11.32 - samples/sec: 4246.28 - lr: 0.000032 - momentum: 0.000000
2023-10-25 16:19:18,650 epoch 5 - iter 267/893 - loss 0.03548914 - time (sec): 17.25 - samples/sec: 4224.53 - lr: 0.000032 - momentum: 0.000000
2023-10-25 16:19:24,459 epoch 5 - iter 356/893 - loss 0.03481599 - time (sec): 23.06 - samples/sec: 4207.93 - lr: 0.000031 - momentum: 0.000000
2023-10-25 16:19:30,023 epoch 5 - iter 445/893 - loss 0.03447645 - time (sec): 28.63 - samples/sec: 4210.77 - lr: 0.000031 - momentum: 0.000000
2023-10-25 16:19:35,602 epoch 5 - iter 534/893 - loss 0.03423318 - time (sec): 34.20 - samples/sec: 4226.86 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:19:41,616 epoch 5 - iter 623/893 - loss 0.03457306 - time (sec): 40.22 - samples/sec: 4242.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:19:47,303 epoch 5 - iter 712/893 - loss 0.03473284 - time (sec): 45.91 - samples/sec: 4253.72 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:19:53,078 epoch 5 - iter 801/893 - loss 0.03473047 - time (sec): 51.68 - samples/sec: 4308.69 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:19:58,731 epoch 5 - iter 890/893 - loss 0.03463942 - time (sec): 57.33 - samples/sec: 4328.07 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:19:58,905 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:58,905 EPOCH 5 done: loss 0.0348 - lr: 0.000028
2023-10-25 16:20:03,118 DEV : loss 0.15929269790649414 - f1-score (micro avg) 0.8035
2023-10-25 16:20:03,139 saving best model
2023-10-25 16:20:03,805 ----------------------------------------------------------------------------------------------------
2023-10-25 16:20:10,697 epoch 6 - iter 89/893 - loss 0.02332392 - time (sec): 6.89 - samples/sec: 3453.26 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:20:16,378 epoch 6 - iter 178/893 - loss 0.02637640 - time (sec): 12.57 - samples/sec: 3883.46 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:20:22,209 epoch 6 - iter 267/893 - loss 0.02903353 - time (sec): 18.40 - samples/sec: 3994.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:20:28,018 epoch 6 - iter 356/893 - loss 0.02755762 - time (sec): 24.21 - samples/sec: 4059.55 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:20:34,069 epoch 6 - iter 445/893 - loss 0.02857473 - time (sec): 30.26 - samples/sec: 4082.88 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:20:39,839 epoch 6 - iter 534/893 - loss 0.02933559 - time (sec): 36.03 - samples/sec: 4102.15 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:20:45,844 epoch 6 - iter 623/893 - loss 0.02930823 - time (sec): 42.04 - samples/sec: 4110.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:20:51,845 epoch 6 - iter 712/893 - loss 0.02914794 - time (sec): 48.04 - samples/sec: 4138.01 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:20:57,419 epoch 6 - iter 801/893 - loss 0.02846516 - time (sec): 53.61 - samples/sec: 4158.83 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:21:03,414 epoch 6 - iter 890/893 - loss 0.02806324 - time (sec): 59.61 - samples/sec: 4161.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:21:03,591 ----------------------------------------------------------------------------------------------------
2023-10-25 16:21:03,592 EPOCH 6 done: loss 0.0281 - lr: 0.000022
2023-10-25 16:21:07,987 DEV : loss 0.17629361152648926 - f1-score (micro avg) 0.8051
2023-10-25 16:21:08,011 saving best model
2023-10-25 16:21:08,670 ----------------------------------------------------------------------------------------------------
2023-10-25 16:21:14,665 epoch 7 - iter 89/893 - loss 0.01807935 - time (sec): 5.99 - samples/sec: 4333.48 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:21:20,552 epoch 7 - iter 178/893 - loss 0.01702157 - time (sec): 11.88 - samples/sec: 4331.32 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:21:26,365 epoch 7 - iter 267/893 - loss 0.01856450 - time (sec): 17.69 - samples/sec: 4348.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:21:31,968 epoch 7 - iter 356/893 - loss 0.01843919 - time (sec): 23.30 - samples/sec: 4380.43 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:21:37,680 epoch 7 - iter 445/893 - loss 0.01921349 - time (sec): 29.01 - samples/sec: 4339.81 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:21:43,467 epoch 7 - iter 534/893 - loss 0.01975706 - time (sec): 34.79 - samples/sec: 4351.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:21:49,103 epoch 7 - iter 623/893 - loss 0.01986136 - time (sec): 40.43 - samples/sec: 4335.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:21:54,799 epoch 7 - iter 712/893 - loss 0.01928109 - time (sec): 46.13 - samples/sec: 4326.33 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:22:00,710 epoch 7 - iter 801/893 - loss 0.01923095 - time (sec): 52.04 - samples/sec: 4317.89 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:22:06,598 epoch 7 - iter 890/893 - loss 0.01918290 - time (sec): 57.92 - samples/sec: 4282.24 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:22:06,777 ----------------------------------------------------------------------------------------------------
2023-10-25 16:22:06,777 EPOCH 7 done: loss 0.0192 - lr: 0.000017
2023-10-25 16:22:11,585 DEV : loss 0.19473356008529663 - f1-score (micro avg) 0.8021
2023-10-25 16:22:11,607 ----------------------------------------------------------------------------------------------------
2023-10-25 16:22:17,531 epoch 8 - iter 89/893 - loss 0.01959540 - time (sec): 5.92 - samples/sec: 4201.88 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:22:23,194 epoch 8 - iter 178/893 - loss 0.01512738 - time (sec): 11.59 - samples/sec: 4143.22 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:22:29,327 epoch 8 - iter 267/893 - loss 0.01384824 - time (sec): 17.72 - samples/sec: 4215.90 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:22:35,277 epoch 8 - iter 356/893 - loss 0.01370296 - time (sec): 23.67 - samples/sec: 4183.57 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:22:41,312 epoch 8 - iter 445/893 - loss 0.01389069 - time (sec): 29.70 - samples/sec: 4172.18 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:22:47,286 epoch 8 - iter 534/893 - loss 0.01477066 - time (sec): 35.68 - samples/sec: 4207.64 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:22:53,170 epoch 8 - iter 623/893 - loss 0.01437157 - time (sec): 41.56 - samples/sec: 4202.76 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:22:59,064 epoch 8 - iter 712/893 - loss 0.01397568 - time (sec): 47.45 - samples/sec: 4207.65 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:23:04,767 epoch 8 - iter 801/893 - loss 0.01390187 - time (sec): 53.16 - samples/sec: 4197.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:23:10,581 epoch 8 - iter 890/893 - loss 0.01427908 - time (sec): 58.97 - samples/sec: 4205.45 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:23:10,755 ----------------------------------------------------------------------------------------------------
2023-10-25 16:23:10,756 EPOCH 8 done: loss 0.0143 - lr: 0.000011
2023-10-25 16:23:15,803 DEV : loss 0.1920691877603531 - f1-score (micro avg) 0.8
2023-10-25 16:23:15,826 ----------------------------------------------------------------------------------------------------
2023-10-25 16:23:21,740 epoch 9 - iter 89/893 - loss 0.00910264 - time (sec): 5.91 - samples/sec: 4529.19 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:23:27,673 epoch 9 - iter 178/893 - loss 0.01115266 - time (sec): 11.85 - samples/sec: 4385.76 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:23:33,568 epoch 9 - iter 267/893 - loss 0.00963172 - time (sec): 17.74 - samples/sec: 4396.12 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:23:39,305 epoch 9 - iter 356/893 - loss 0.00981929 - time (sec): 23.48 - samples/sec: 4312.64 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:23:44,931 epoch 9 - iter 445/893 - loss 0.01013183 - time (sec): 29.10 - samples/sec: 4248.67 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:23:50,521 epoch 9 - iter 534/893 - loss 0.01006792 - time (sec): 34.69 - samples/sec: 4230.50 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:23:56,224 epoch 9 - iter 623/893 - loss 0.00966847 - time (sec): 40.40 - samples/sec: 4263.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:24:01,897 epoch 9 - iter 712/893 - loss 0.00963917 - time (sec): 46.07 - samples/sec: 4278.88 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:24:07,419 epoch 9 - iter 801/893 - loss 0.00953469 - time (sec): 51.59 - samples/sec: 4291.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:24:13,382 epoch 9 - iter 890/893 - loss 0.00931090 - time (sec): 57.55 - samples/sec: 4310.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:24:13,561 ----------------------------------------------------------------------------------------------------
2023-10-25 16:24:13,562 EPOCH 9 done: loss 0.0093 - lr: 0.000006
2023-10-25 16:24:17,665 DEV : loss 0.21395151317119598 - f1-score (micro avg) 0.7989
2023-10-25 16:24:17,686 ----------------------------------------------------------------------------------------------------
2023-10-25 16:24:23,693 epoch 10 - iter 89/893 - loss 0.00447968 - time (sec): 6.01 - samples/sec: 4352.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:24:29,482 epoch 10 - iter 178/893 - loss 0.00644700 - time (sec): 11.79 - samples/sec: 4259.83 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:24:35,198 epoch 10 - iter 267/893 - loss 0.00651104 - time (sec): 17.51 - samples/sec: 4319.85 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:24:41,020 epoch 10 - iter 356/893 - loss 0.00686979 - time (sec): 23.33 - samples/sec: 4260.72 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:24:46,929 epoch 10 - iter 445/893 - loss 0.00690950 - time (sec): 29.24 - samples/sec: 4227.17 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:24:52,949 epoch 10 - iter 534/893 - loss 0.00654009 - time (sec): 35.26 - samples/sec: 4238.85 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:24:58,798 epoch 10 - iter 623/893 - loss 0.00647660 - time (sec): 41.11 - samples/sec: 4261.62 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:25:04,630 epoch 10 - iter 712/893 - loss 0.00632372 - time (sec): 46.94 - samples/sec: 4262.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:25:10,334 epoch 10 - iter 801/893 - loss 0.00654215 - time (sec): 52.65 - samples/sec: 4249.10 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:25:16,167 epoch 10 - iter 890/893 - loss 0.00615978 - time (sec): 58.48 - samples/sec: 4245.00 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:25:16,341 ----------------------------------------------------------------------------------------------------
2023-10-25 16:25:16,341 EPOCH 10 done: loss 0.0061 - lr: 0.000000
2023-10-25 16:25:21,059 DEV : loss 0.2173573076725006 - f1-score (micro avg) 0.8064
2023-10-25 16:25:21,080 saving best model
2023-10-25 16:25:22,110 ----------------------------------------------------------------------------------------------------
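Note on the "saving best model" lines above: they appear after epochs 1 to 6 and again after epoch 10, which is what one would expect if a checkpoint is written whenever the DEV micro-F1 strictly exceeds the best value seen so far. The sketch below replays that rule on the logged DEV scores. The strict-improvement rule and the function name `epochs_with_new_best` are assumptions for illustration, not taken from Flair's source.

```python
# DEV f1-score (micro avg) per epoch, copied from the log above.
dev_f1 = [0.7017, 0.7568, 0.7885, 0.7995, 0.8035,
          0.8051, 0.8021, 0.8, 0.7989, 0.8064]


def epochs_with_new_best(scores):
    """Hypothetical helper: return 1-based epochs where the score strictly improves."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            saved.append(epoch)
    return saved


saved_epochs = epochs_with_new_best(dev_f1)
print(saved_epochs)  # [1, 2, 3, 4, 5, 6, 10]
```

Under this assumption, the model loaded for the final evaluation is the epoch 10 checkpoint, with a DEV micro-F1 of 0.8064.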
2023-10-25 16:25:22,112 Loading model from best epoch ...
2023-10-25 16:25:23,975 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 16:25:36,200
Results:
- F-score (micro) 0.6851
- F-score (macro) 0.6194
- Accuracy 0.537
By class:
              precision    recall  f1-score   support

         LOC     0.6678    0.6941    0.6807      1095
         PER     0.7727    0.7658    0.7692      1012
         ORG     0.4590    0.5490    0.5000       357
   HumanProd     0.4138    0.7273    0.5275        33

   micro avg     0.6683    0.7028    0.6851      2497
   macro avg     0.5783    0.6840    0.6194      2497
weighted avg     0.6771    0.7028    0.6887      2497
2023-10-25 16:25:36,200 ----------------------------------------------------------------------------------------------------
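Note on the summary rows of the test results: the micro, macro and weighted F1 values can be cross-checked from the per-class rows. The sketch below assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean of class F1 scores, weighted as the support-weighted mean). All numbers are copied from the table; the variable names are illustrative only, and small differences are expected because the inputs are rounded to four decimals.

```python
# Per-class rows from the log: (precision, recall, f1, support).
per_class = {
    "LOC": (0.6678, 0.6941, 0.6807, 1095),
    "PER": (0.7727, 0.7658, 0.7692, 1012),
    "ORG": (0.4590, 0.5490, 0.5000, 357),
    "HumanProd": (0.4138, 0.7273, 0.5275, 33),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in per_class.values())

# Micro F1 recomputed from the logged micro-averaged precision and recall.
micro_f1 = f1(0.6683, 0.7028)
# Macro F1: unweighted mean of the class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)
# Weighted F1: class F1 scores weighted by support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```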