2023-10-18 14:43:30,985 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,986 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:43:30,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,986 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:43:30,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,986 Train: 1100 sentences
2023-10-18 14:43:30,986 (train_with_dev=False, train_with_test=False)
2023-10-18 14:43:30,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,986 Training Params:
2023-10-18 14:43:30,986 - learning_rate: "5e-05"
2023-10-18 14:43:30,986 - mini_batch_size: "4"
2023-10-18 14:43:30,986 - max_epochs: "10"
2023-10-18 14:43:30,986 - shuffle: "True"
2023-10-18 14:43:30,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,986 Plugins:
2023-10-18 14:43:30,986 - TensorboardLogger
2023-10-18 14:43:30,986 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:43:30,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,986 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:43:30,986 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:43:30,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,987 Computation:
2023-10-18 14:43:30,987 - compute on device: cuda:0
2023-10-18 14:43:30,987 - embedding storage: none
2023-10-18 14:43:30,987 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,987 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 14:43:30,987 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,987 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:30,987 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:43:31,391 epoch 1 - iter 27/275 - loss 4.03243994 - time (sec): 0.40 - samples/sec: 4913.90 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:43:31,799 epoch 1 - iter 54/275 - loss 3.94570068 - time (sec): 0.81 - samples/sec: 5336.44 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:43:32,203 epoch 1 - iter 81/275 - loss 3.78668884 - time (sec): 1.22 - samples/sec: 5284.72 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:43:32,640 epoch 1 - iter 108/275 - loss 3.54407889 - time (sec): 1.65 - samples/sec: 5346.64 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:43:33,050 epoch 1 - iter 135/275 - loss 3.32952951 - time (sec): 2.06 - samples/sec: 5377.45 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:43:33,459 epoch 1 - iter 162/275 - loss 3.07492688 - time (sec): 2.47 - samples/sec: 5437.85 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:43:33,863 epoch 1 - iter 189/275 - loss 2.82838245 - time (sec): 2.88 - samples/sec: 5460.46 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:43:34,274 epoch 1 - iter 216/275 - loss 2.58190842 - time (sec): 3.29 - samples/sec: 5557.78 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:43:34,662 epoch 1 - iter 243/275 - loss 2.42149492 - time (sec): 3.67 - samples/sec: 5509.32 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:43:35,064 epoch 1 - iter 270/275 - loss 2.28777710 - time (sec): 4.08 - samples/sec: 5502.28 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:43:35,135 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:35,135 EPOCH 1 done: loss 2.2647 - lr: 0.000049
2023-10-18 14:43:35,389 DEV : loss 0.8980287313461304 - f1-score (micro avg) 0.0
2023-10-18 14:43:35,394 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:35,861 epoch 2 - iter 27/275 - loss 0.78134023 - time (sec): 0.47 - samples/sec: 5452.93 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:43:36,290 epoch 2 - iter 54/275 - loss 0.81836184 - time (sec): 0.89 - samples/sec: 5236.01 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:43:36,691 epoch 2 - iter 81/275 - loss 0.86443923 - time (sec): 1.30 - samples/sec: 5364.83 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:43:37,099 epoch 2 - iter 108/275 - loss 0.88101045 - time (sec): 1.70 - samples/sec: 5208.15 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:43:37,496 epoch 2 - iter 135/275 - loss 0.85790201 - time (sec): 2.10 - samples/sec: 5227.88 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:43:37,906 epoch 2 - iter 162/275 - loss 0.83927798 - time (sec): 2.51 - samples/sec: 5311.25 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:43:38,301 epoch 2 - iter 189/275 - loss 0.83267514 - time (sec): 2.91 - samples/sec: 5255.59 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:43:38,715 epoch 2 - iter 216/275 - loss 0.82029440 - time (sec): 3.32 - samples/sec: 5332.04 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:43:39,123 epoch 2 - iter 243/275 - loss 0.80751106 - time (sec): 3.73 - samples/sec: 5387.77 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:43:39,540 epoch 2 - iter 270/275 - loss 0.78403155 - time (sec): 4.14 - samples/sec: 5409.28 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:43:39,615 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:39,615 EPOCH 2 done: loss 0.7873 - lr: 0.000045
2023-10-18 14:43:39,977 DEV : loss 0.5882683396339417 - f1-score (micro avg) 0.1087
2023-10-18 14:43:39,983 saving best model
2023-10-18 14:43:40,015 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:40,418 epoch 3 - iter 27/275 - loss 0.69388767 - time (sec): 0.40 - samples/sec: 5755.49 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:43:40,819 epoch 3 - iter 54/275 - loss 0.69342049 - time (sec): 0.80 - samples/sec: 5998.04 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:43:41,240 epoch 3 - iter 81/275 - loss 0.64883093 - time (sec): 1.22 - samples/sec: 5789.52 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:43:41,648 epoch 3 - iter 108/275 - loss 0.61664166 - time (sec): 1.63 - samples/sec: 5743.95 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:43:42,064 epoch 3 - iter 135/275 - loss 0.60668920 - time (sec): 2.05 - samples/sec: 5671.59 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:43:42,469 epoch 3 - iter 162/275 - loss 0.58595365 - time (sec): 2.45 - samples/sec: 5607.56 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:43:42,880 epoch 3 - iter 189/275 - loss 0.58057106 - time (sec): 2.86 - samples/sec: 5533.10 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:43:43,289 epoch 3 - iter 216/275 - loss 0.57689387 - time (sec): 3.27 - samples/sec: 5527.63 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:43:43,714 epoch 3 - iter 243/275 - loss 0.57041795 - time (sec): 3.70 - samples/sec: 5516.54 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:43:44,128 epoch 3 - iter 270/275 - loss 0.57443584 - time (sec): 4.11 - samples/sec: 5454.11 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:43:44,205 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:44,205 EPOCH 3 done: loss 0.5706 - lr: 0.000039
2023-10-18 14:43:44,711 DEV : loss 0.40124934911727905 - f1-score (micro avg) 0.4511
2023-10-18 14:43:44,715 saving best model
2023-10-18 14:43:44,748 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:45,143 epoch 4 - iter 27/275 - loss 0.50055148 - time (sec): 0.39 - samples/sec: 5265.23 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:43:45,528 epoch 4 - iter 54/275 - loss 0.51970347 - time (sec): 0.78 - samples/sec: 5258.61 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:43:45,928 epoch 4 - iter 81/275 - loss 0.51732211 - time (sec): 1.18 - samples/sec: 5367.27 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:43:46,329 epoch 4 - iter 108/275 - loss 0.50755962 - time (sec): 1.58 - samples/sec: 5437.72 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:43:46,729 epoch 4 - iter 135/275 - loss 0.49628465 - time (sec): 1.98 - samples/sec: 5464.38 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:43:47,140 epoch 4 - iter 162/275 - loss 0.50076582 - time (sec): 2.39 - samples/sec: 5510.00 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:43:47,548 epoch 4 - iter 189/275 - loss 0.49510928 - time (sec): 2.80 - samples/sec: 5495.77 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:43:47,960 epoch 4 - iter 216/275 - loss 0.48738228 - time (sec): 3.21 - samples/sec: 5541.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:43:48,372 epoch 4 - iter 243/275 - loss 0.47576913 - time (sec): 3.62 - samples/sec: 5580.11 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:43:48,776 epoch 4 - iter 270/275 - loss 0.46872942 - time (sec): 4.03 - samples/sec: 5552.67 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:43:48,850 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:48,851 EPOCH 4 done: loss 0.4626 - lr: 0.000034
2023-10-18 14:43:49,220 DEV : loss 0.35367265343666077 - f1-score (micro avg) 0.5377
2023-10-18 14:43:49,224 saving best model
2023-10-18 14:43:49,259 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:49,669 epoch 5 - iter 27/275 - loss 0.43738549 - time (sec): 0.41 - samples/sec: 5917.56 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:43:50,082 epoch 5 - iter 54/275 - loss 0.39799114 - time (sec): 0.82 - samples/sec: 5796.05 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:43:50,487 epoch 5 - iter 81/275 - loss 0.39027648 - time (sec): 1.23 - samples/sec: 5653.57 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:43:50,890 epoch 5 - iter 108/275 - loss 0.40245857 - time (sec): 1.63 - samples/sec: 5667.85 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:43:51,307 epoch 5 - iter 135/275 - loss 0.39711821 - time (sec): 2.05 - samples/sec: 5615.86 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:43:51,717 epoch 5 - iter 162/275 - loss 0.40331691 - time (sec): 2.46 - samples/sec: 5526.32 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:43:52,119 epoch 5 - iter 189/275 - loss 0.41216604 - time (sec): 2.86 - samples/sec: 5580.31 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:43:52,518 epoch 5 - iter 216/275 - loss 0.40997604 - time (sec): 3.26 - samples/sec: 5523.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:43:52,920 epoch 5 - iter 243/275 - loss 0.40419920 - time (sec): 3.66 - samples/sec: 5461.61 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:43:53,330 epoch 5 - iter 270/275 - loss 0.40169560 - time (sec): 4.07 - samples/sec: 5492.40 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:43:53,407 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:53,407 EPOCH 5 done: loss 0.4014 - lr: 0.000028
2023-10-18 14:43:53,776 DEV : loss 0.30840688943862915 - f1-score (micro avg) 0.5665
2023-10-18 14:43:53,780 saving best model
2023-10-18 14:43:53,813 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:54,210 epoch 6 - iter 27/275 - loss 0.35894930 - time (sec): 0.40 - samples/sec: 5284.38 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:43:54,614 epoch 6 - iter 54/275 - loss 0.35477863 - time (sec): 0.80 - samples/sec: 5249.74 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:43:55,013 epoch 6 - iter 81/275 - loss 0.35568652 - time (sec): 1.20 - samples/sec: 5126.14 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:43:55,433 epoch 6 - iter 108/275 - loss 0.36290326 - time (sec): 1.62 - samples/sec: 5281.63 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:43:55,838 epoch 6 - iter 135/275 - loss 0.36415148 - time (sec): 2.02 - samples/sec: 5343.64 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:43:56,260 epoch 6 - iter 162/275 - loss 0.36307390 - time (sec): 2.45 - samples/sec: 5431.13 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:43:56,663 epoch 6 - iter 189/275 - loss 0.35302182 - time (sec): 2.85 - samples/sec: 5450.89 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:43:57,066 epoch 6 - iter 216/275 - loss 0.36076859 - time (sec): 3.25 - samples/sec: 5391.34 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:43:57,461 epoch 6 - iter 243/275 - loss 0.37265096 - time (sec): 3.65 - samples/sec: 5409.18 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:43:57,882 epoch 6 - iter 270/275 - loss 0.36682205 - time (sec): 4.07 - samples/sec: 5476.10 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:43:57,960 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:57,960 EPOCH 6 done: loss 0.3661 - lr: 0.000022
2023-10-18 14:43:58,329 DEV : loss 0.29626935720443726 - f1-score (micro avg) 0.5761
2023-10-18 14:43:58,334 saving best model
2023-10-18 14:43:58,367 ----------------------------------------------------------------------------------------------------
2023-10-18 14:43:58,769 epoch 7 - iter 27/275 - loss 0.41248552 - time (sec): 0.40 - samples/sec: 4765.46 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:43:59,174 epoch 7 - iter 54/275 - loss 0.37364688 - time (sec): 0.81 - samples/sec: 5248.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:43:59,591 epoch 7 - iter 81/275 - loss 0.34262108 - time (sec): 1.22 - samples/sec: 5527.43 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:43:59,997 epoch 7 - iter 108/275 - loss 0.34718865 - time (sec): 1.63 - samples/sec: 5545.54 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:44:00,393 epoch 7 - iter 135/275 - loss 0.35484492 - time (sec): 2.03 - samples/sec: 5487.81 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:44:00,814 epoch 7 - iter 162/275 - loss 0.35510658 - time (sec): 2.45 - samples/sec: 5413.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:44:01,233 epoch 7 - iter 189/275 - loss 0.35175496 - time (sec): 2.87 - samples/sec: 5474.12 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:44:01,639 epoch 7 - iter 216/275 - loss 0.35075373 - time (sec): 3.27 - samples/sec: 5448.81 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:44:02,053 epoch 7 - iter 243/275 - loss 0.34856969 - time (sec): 3.69 - samples/sec: 5489.67 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:44:02,462 epoch 7 - iter 270/275 - loss 0.34730130 - time (sec): 4.09 - samples/sec: 5468.14 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:44:02,539 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:02,539 EPOCH 7 done: loss 0.3465 - lr: 0.000017
2023-10-18 14:44:02,911 DEV : loss 0.27915582060813904 - f1-score (micro avg) 0.5926
2023-10-18 14:44:02,915 saving best model
2023-10-18 14:44:02,949 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:03,364 epoch 8 - iter 27/275 - loss 0.35848511 - time (sec): 0.41 - samples/sec: 5845.18 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:44:03,779 epoch 8 - iter 54/275 - loss 0.34775339 - time (sec): 0.83 - samples/sec: 5720.37 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:44:04,189 epoch 8 - iter 81/275 - loss 0.35859633 - time (sec): 1.24 - samples/sec: 5795.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:44:04,602 epoch 8 - iter 108/275 - loss 0.34543374 - time (sec): 1.65 - samples/sec: 5635.89 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:44:05,007 epoch 8 - iter 135/275 - loss 0.34767511 - time (sec): 2.06 - samples/sec: 5582.96 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:44:05,415 epoch 8 - iter 162/275 - loss 0.33705009 - time (sec): 2.47 - samples/sec: 5485.67 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:44:05,809 epoch 8 - iter 189/275 - loss 0.33645892 - time (sec): 2.86 - samples/sec: 5460.05 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:44:06,217 epoch 8 - iter 216/275 - loss 0.33479643 - time (sec): 3.27 - samples/sec: 5450.33 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:44:06,629 epoch 8 - iter 243/275 - loss 0.32980765 - time (sec): 3.68 - samples/sec: 5459.35 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:44:07,047 epoch 8 - iter 270/275 - loss 0.32263767 - time (sec): 4.10 - samples/sec: 5466.73 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:44:07,126 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:07,126 EPOCH 8 done: loss 0.3219 - lr: 0.000011
2023-10-18 14:44:07,495 DEV : loss 0.2702096104621887 - f1-score (micro avg) 0.5993
2023-10-18 14:44:07,499 saving best model
2023-10-18 14:44:07,532 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:07,940 epoch 9 - iter 27/275 - loss 0.34018936 - time (sec): 0.41 - samples/sec: 5450.01 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:44:08,343 epoch 9 - iter 54/275 - loss 0.31793969 - time (sec): 0.81 - samples/sec: 5405.90 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:44:08,750 epoch 9 - iter 81/275 - loss 0.31206053 - time (sec): 1.22 - samples/sec: 5313.60 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:44:09,159 epoch 9 - iter 108/275 - loss 0.31170653 - time (sec): 1.63 - samples/sec: 5239.50 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:44:09,567 epoch 9 - iter 135/275 - loss 0.31614410 - time (sec): 2.03 - samples/sec: 5181.67 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:44:09,974 epoch 9 - iter 162/275 - loss 0.31441368 - time (sec): 2.44 - samples/sec: 5242.22 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:44:10,383 epoch 9 - iter 189/275 - loss 0.32431431 - time (sec): 2.85 - samples/sec: 5336.02 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:44:10,802 epoch 9 - iter 216/275 - loss 0.31466197 - time (sec): 3.27 - samples/sec: 5362.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:44:11,235 epoch 9 - iter 243/275 - loss 0.31529849 - time (sec): 3.70 - samples/sec: 5439.49 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:44:11,651 epoch 9 - iter 270/275 - loss 0.31350265 - time (sec): 4.12 - samples/sec: 5436.90 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:44:11,726 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:11,726 EPOCH 9 done: loss 0.3157 - lr: 0.000006
2023-10-18 14:44:12,103 DEV : loss 0.2669928967952728 - f1-score (micro avg) 0.6061
2023-10-18 14:44:12,108 saving best model
2023-10-18 14:44:12,146 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:12,559 epoch 10 - iter 27/275 - loss 0.33483529 - time (sec): 0.41 - samples/sec: 5345.07 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:12,965 epoch 10 - iter 54/275 - loss 0.32748772 - time (sec): 0.82 - samples/sec: 5641.30 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:44:13,369 epoch 10 - iter 81/275 - loss 0.35425091 - time (sec): 1.22 - samples/sec: 5676.62 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:44:13,761 epoch 10 - iter 108/275 - loss 0.32724076 - time (sec): 1.61 - samples/sec: 5580.81 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:44:14,161 epoch 10 - iter 135/275 - loss 0.33067401 - time (sec): 2.01 - samples/sec: 5741.54 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:44:14,580 epoch 10 - iter 162/275 - loss 0.32389891 - time (sec): 2.43 - samples/sec: 5632.88 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:44:14,991 epoch 10 - iter 189/275 - loss 0.32167552 - time (sec): 2.84 - samples/sec: 5561.38 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:44:15,395 epoch 10 - iter 216/275 - loss 0.31850865 - time (sec): 3.25 - samples/sec: 5541.60 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:44:15,790 epoch 10 - iter 243/275 - loss 0.31510525 - time (sec): 3.64 - samples/sec: 5537.47 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:44:16,197 epoch 10 - iter 270/275 - loss 0.31451221 - time (sec): 4.05 - samples/sec: 5521.62 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:44:16,275 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:16,275 EPOCH 10 done: loss 0.3132 - lr: 0.000000
2023-10-18 14:44:16,654 DEV : loss 0.2650297284126282 - f1-score (micro avg) 0.6087
2023-10-18 14:44:16,658 saving best model
2023-10-18 14:44:16,727 ----------------------------------------------------------------------------------------------------
2023-10-18 14:44:16,727 Loading model from best epoch ...
2023-10-18 14:44:16,804 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:44:17,094
Results:
- F-score (micro) 0.6632
- F-score (macro) 0.3933
- Accuracy 0.5029
By class:
              precision    recall  f1-score   support

       scope     0.6237    0.6591    0.6409       176
        pers     0.8750    0.7656    0.8167       128
        work     0.4526    0.5811    0.5089        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.6539    0.6728    0.6632       382
   macro avg     0.3903    0.4012    0.3933       382
weighted avg     0.6682    0.6728    0.6675       382
2023-10-18 14:44:17,094 ----------------------------------------------------------------------------------------------------