2023-10-13 10:41:45,041 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Train: 966 sentences
2023-10-13 10:41:45,042 (train_with_dev=False, train_with_test=False)
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Training Params:
2023-10-13 10:41:45,042 - learning_rate: "5e-05"
2023-10-13 10:41:45,042 - mini_batch_size: "4"
2023-10-13 10:41:45,042 - max_epochs: "10"
2023-10-13 10:41:45,042 - shuffle: "True"
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Plugins:
2023-10-13 10:41:45,042 - LinearScheduler | warmup_fraction: '0.1'
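The LinearScheduler plugin with `warmup_fraction: 0.1` explains the lr column below: the learning rate ramps linearly from 0 to the peak 5e-05 over the first 10% of all optimizer steps (10 epochs x 242 mini-batches = 2420 steps, so warmup covers roughly epoch 1), then decays linearly back to 0. A minimal sketch of that schedule (Flair's exact step bookkeeping may differ by an off-by-one at epoch boundaries, so the logged values match only approximately):

```python
def linear_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up
    # decay over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps, peak = 2420, 5e-05
print(linear_lr(24, total_steps, peak))    # early in epoch 1: ~5e-06, as logged at iter 24
print(linear_lr(242, total_steps, peak))   # warmup done: peak lr 5e-05
print(linear_lr(2420, total_steps, peak))  # final step: decayed to 0
```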
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:41:45,042 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Computation:
2023-10-13 10:41:45,042 - compute on device: cuda:0
2023-10-13 10:41:45,042 - embedding storage: none
2023-10-13 10:41:45,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,042 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 10:41:45,043 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:45,043 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:46,117 epoch 1 - iter 24/242 - loss 3.46156184 - time (sec): 1.07 - samples/sec: 2233.92 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:41:47,199 epoch 1 - iter 48/242 - loss 2.81392470 - time (sec): 2.15 - samples/sec: 2297.46 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:41:48,281 epoch 1 - iter 72/242 - loss 2.17375436 - time (sec): 3.24 - samples/sec: 2160.25 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:41:49,394 epoch 1 - iter 96/242 - loss 1.73004972 - time (sec): 4.35 - samples/sec: 2254.78 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:41:50,474 epoch 1 - iter 120/242 - loss 1.48446200 - time (sec): 5.43 - samples/sec: 2265.26 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:41:51,540 epoch 1 - iter 144/242 - loss 1.30127550 - time (sec): 6.50 - samples/sec: 2274.92 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:41:52,629 epoch 1 - iter 168/242 - loss 1.17060706 - time (sec): 7.59 - samples/sec: 2266.72 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:41:53,721 epoch 1 - iter 192/242 - loss 1.05963817 - time (sec): 8.68 - samples/sec: 2279.87 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:41:54,817 epoch 1 - iter 216/242 - loss 0.97540615 - time (sec): 9.77 - samples/sec: 2273.70 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:41:55,890 epoch 1 - iter 240/242 - loss 0.90301481 - time (sec): 10.85 - samples/sec: 2276.33 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:41:55,977 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:55,978 EPOCH 1 done: loss 0.9024 - lr: 0.000049
2023-10-13 10:41:56,737 DEV : loss 0.25707900524139404 - f1-score (micro avg) 0.544
2023-10-13 10:41:56,742 saving best model
2023-10-13 10:41:57,135 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:58,193 epoch 2 - iter 24/242 - loss 0.21922578 - time (sec): 1.06 - samples/sec: 2456.38 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:41:59,254 epoch 2 - iter 48/242 - loss 0.22057656 - time (sec): 2.12 - samples/sec: 2307.40 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:42:00,316 epoch 2 - iter 72/242 - loss 0.21312233 - time (sec): 3.18 - samples/sec: 2267.79 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:42:01,400 epoch 2 - iter 96/242 - loss 0.19917169 - time (sec): 4.26 - samples/sec: 2322.77 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:42:02,530 epoch 2 - iter 120/242 - loss 0.20003692 - time (sec): 5.39 - samples/sec: 2309.10 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:42:03,603 epoch 2 - iter 144/242 - loss 0.19527815 - time (sec): 6.47 - samples/sec: 2287.22 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:42:04,663 epoch 2 - iter 168/242 - loss 0.18678743 - time (sec): 7.53 - samples/sec: 2273.83 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:42:05,745 epoch 2 - iter 192/242 - loss 0.18432895 - time (sec): 8.61 - samples/sec: 2262.33 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:42:06,822 epoch 2 - iter 216/242 - loss 0.17934945 - time (sec): 9.69 - samples/sec: 2270.62 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:42:07,916 epoch 2 - iter 240/242 - loss 0.17612460 - time (sec): 10.78 - samples/sec: 2279.81 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:42:08,001 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:08,001 EPOCH 2 done: loss 0.1762 - lr: 0.000045
2023-10-13 10:42:08,785 DEV : loss 0.12716606259346008 - f1-score (micro avg) 0.8287
2023-10-13 10:42:08,790 saving best model
2023-10-13 10:42:09,289 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:10,514 epoch 3 - iter 24/242 - loss 0.10879230 - time (sec): 1.22 - samples/sec: 1873.10 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:42:11,627 epoch 3 - iter 48/242 - loss 0.10523804 - time (sec): 2.33 - samples/sec: 2013.69 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:42:12,753 epoch 3 - iter 72/242 - loss 0.10705649 - time (sec): 3.46 - samples/sec: 2066.79 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:42:13,832 epoch 3 - iter 96/242 - loss 0.10718156 - time (sec): 4.54 - samples/sec: 2092.91 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:42:14,929 epoch 3 - iter 120/242 - loss 0.10632305 - time (sec): 5.63 - samples/sec: 2075.54 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:42:16,006 epoch 3 - iter 144/242 - loss 0.11007337 - time (sec): 6.71 - samples/sec: 2155.21 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:42:17,056 epoch 3 - iter 168/242 - loss 0.10718185 - time (sec): 7.76 - samples/sec: 2161.30 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:42:18,122 epoch 3 - iter 192/242 - loss 0.10348610 - time (sec): 8.83 - samples/sec: 2205.15 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:42:19,179 epoch 3 - iter 216/242 - loss 0.10511225 - time (sec): 9.88 - samples/sec: 2210.22 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:42:20,294 epoch 3 - iter 240/242 - loss 0.10233471 - time (sec): 11.00 - samples/sec: 2232.49 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:42:20,383 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:20,383 EPOCH 3 done: loss 0.1022 - lr: 0.000039
2023-10-13 10:42:21,171 DEV : loss 0.13867823779582977 - f1-score (micro avg) 0.8015
2023-10-13 10:42:21,176 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:22,312 epoch 4 - iter 24/242 - loss 0.05512091 - time (sec): 1.13 - samples/sec: 2246.86 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:42:23,397 epoch 4 - iter 48/242 - loss 0.06593591 - time (sec): 2.22 - samples/sec: 2387.78 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:42:24,508 epoch 4 - iter 72/242 - loss 0.06551999 - time (sec): 3.33 - samples/sec: 2303.46 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:42:25,586 epoch 4 - iter 96/242 - loss 0.07182048 - time (sec): 4.41 - samples/sec: 2301.69 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:42:26,653 epoch 4 - iter 120/242 - loss 0.07439872 - time (sec): 5.48 - samples/sec: 2283.65 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:42:27,719 epoch 4 - iter 144/242 - loss 0.07260853 - time (sec): 6.54 - samples/sec: 2258.08 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:42:28,800 epoch 4 - iter 168/242 - loss 0.07543241 - time (sec): 7.62 - samples/sec: 2264.30 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:42:29,899 epoch 4 - iter 192/242 - loss 0.07408404 - time (sec): 8.72 - samples/sec: 2279.13 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:42:30,981 epoch 4 - iter 216/242 - loss 0.07349438 - time (sec): 9.80 - samples/sec: 2262.90 - lr: 0.000034 - momentum: 0.000000
2023-10-13 10:42:32,059 epoch 4 - iter 240/242 - loss 0.07588385 - time (sec): 10.88 - samples/sec: 2260.33 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:42:32,147 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:32,147 EPOCH 4 done: loss 0.0754 - lr: 0.000033
2023-10-13 10:42:32,946 DEV : loss 0.15469150245189667 - f1-score (micro avg) 0.8162
2023-10-13 10:42:32,951 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:34,038 epoch 5 - iter 24/242 - loss 0.06090096 - time (sec): 1.09 - samples/sec: 2255.79 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:42:35,140 epoch 5 - iter 48/242 - loss 0.04977777 - time (sec): 2.19 - samples/sec: 2335.26 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:42:36,228 epoch 5 - iter 72/242 - loss 0.05334767 - time (sec): 3.28 - samples/sec: 2293.63 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:42:37,305 epoch 5 - iter 96/242 - loss 0.05367350 - time (sec): 4.35 - samples/sec: 2328.72 - lr: 0.000031 - momentum: 0.000000
2023-10-13 10:42:38,352 epoch 5 - iter 120/242 - loss 0.05266192 - time (sec): 5.40 - samples/sec: 2297.46 - lr: 0.000031 - momentum: 0.000000
2023-10-13 10:42:39,448 epoch 5 - iter 144/242 - loss 0.05666722 - time (sec): 6.50 - samples/sec: 2284.79 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:42:40,504 epoch 5 - iter 168/242 - loss 0.05660339 - time (sec): 7.55 - samples/sec: 2281.36 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:42:41,591 epoch 5 - iter 192/242 - loss 0.05659675 - time (sec): 8.64 - samples/sec: 2274.79 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:42:42,694 epoch 5 - iter 216/242 - loss 0.05842757 - time (sec): 9.74 - samples/sec: 2289.87 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:42:43,764 epoch 5 - iter 240/242 - loss 0.05711639 - time (sec): 10.81 - samples/sec: 2276.07 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:42:43,851 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:43,851 EPOCH 5 done: loss 0.0568 - lr: 0.000028
2023-10-13 10:42:44,614 DEV : loss 0.16716918349266052 - f1-score (micro avg) 0.7995
2023-10-13 10:42:44,620 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:45,679 epoch 6 - iter 24/242 - loss 0.05513007 - time (sec): 1.06 - samples/sec: 2174.08 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:42:46,767 epoch 6 - iter 48/242 - loss 0.03670094 - time (sec): 2.15 - samples/sec: 2181.62 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:42:47,873 epoch 6 - iter 72/242 - loss 0.03594791 - time (sec): 3.25 - samples/sec: 2254.58 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:42:48,947 epoch 6 - iter 96/242 - loss 0.04310180 - time (sec): 4.33 - samples/sec: 2310.95 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:42:50,023 epoch 6 - iter 120/242 - loss 0.04266758 - time (sec): 5.40 - samples/sec: 2254.99 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:42:51,148 epoch 6 - iter 144/242 - loss 0.04022931 - time (sec): 6.53 - samples/sec: 2275.02 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:42:52,297 epoch 6 - iter 168/242 - loss 0.03618281 - time (sec): 7.68 - samples/sec: 2272.67 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:42:53,733 epoch 6 - iter 192/242 - loss 0.03647934 - time (sec): 9.11 - samples/sec: 2187.64 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:42:54,863 epoch 6 - iter 216/242 - loss 0.03843713 - time (sec): 10.24 - samples/sec: 2170.19 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:42:55,938 epoch 6 - iter 240/242 - loss 0.03765580 - time (sec): 11.32 - samples/sec: 2174.28 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:42:56,026 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:56,026 EPOCH 6 done: loss 0.0375 - lr: 0.000022
2023-10-13 10:42:56,813 DEV : loss 0.18939390778541565 - f1-score (micro avg) 0.8275
2023-10-13 10:42:56,818 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:57,856 epoch 7 - iter 24/242 - loss 0.03697281 - time (sec): 1.04 - samples/sec: 2275.70 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:42:58,926 epoch 7 - iter 48/242 - loss 0.02153311 - time (sec): 2.11 - samples/sec: 2296.40 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:43:00,162 epoch 7 - iter 72/242 - loss 0.02010875 - time (sec): 3.34 - samples/sec: 2130.23 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:43:01,245 epoch 7 - iter 96/242 - loss 0.02098508 - time (sec): 4.43 - samples/sec: 2197.41 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:43:02,337 epoch 7 - iter 120/242 - loss 0.02082330 - time (sec): 5.52 - samples/sec: 2265.18 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:43:03,409 epoch 7 - iter 144/242 - loss 0.02213325 - time (sec): 6.59 - samples/sec: 2265.26 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:43:04,494 epoch 7 - iter 168/242 - loss 0.02396625 - time (sec): 7.68 - samples/sec: 2258.95 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:43:05,554 epoch 7 - iter 192/242 - loss 0.02550011 - time (sec): 8.74 - samples/sec: 2253.23 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:43:06,598 epoch 7 - iter 216/242 - loss 0.02541429 - time (sec): 9.78 - samples/sec: 2273.01 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:43:07,688 epoch 7 - iter 240/242 - loss 0.02778038 - time (sec): 10.87 - samples/sec: 2261.55 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:43:07,774 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:07,774 EPOCH 7 done: loss 0.0278 - lr: 0.000017
2023-10-13 10:43:08,526 DEV : loss 0.19671836495399475 - f1-score (micro avg) 0.8241
2023-10-13 10:43:08,531 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:09,603 epoch 8 - iter 24/242 - loss 0.01145862 - time (sec): 1.07 - samples/sec: 2222.15 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:43:10,672 epoch 8 - iter 48/242 - loss 0.01812603 - time (sec): 2.14 - samples/sec: 2093.84 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:43:11,763 epoch 8 - iter 72/242 - loss 0.01811323 - time (sec): 3.23 - samples/sec: 2144.72 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:43:12,881 epoch 8 - iter 96/242 - loss 0.01947770 - time (sec): 4.35 - samples/sec: 2229.18 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:43:13,935 epoch 8 - iter 120/242 - loss 0.01896046 - time (sec): 5.40 - samples/sec: 2247.94 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:43:15,003 epoch 8 - iter 144/242 - loss 0.01837430 - time (sec): 6.47 - samples/sec: 2266.90 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:43:16,059 epoch 8 - iter 168/242 - loss 0.02146016 - time (sec): 7.53 - samples/sec: 2255.52 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:43:17,130 epoch 8 - iter 192/242 - loss 0.01955272 - time (sec): 8.60 - samples/sec: 2234.63 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:43:18,239 epoch 8 - iter 216/242 - loss 0.01848544 - time (sec): 9.71 - samples/sec: 2266.63 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:43:19,317 epoch 8 - iter 240/242 - loss 0.01827427 - time (sec): 10.78 - samples/sec: 2282.07 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:43:19,400 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:19,400 EPOCH 8 done: loss 0.0182 - lr: 0.000011
2023-10-13 10:43:20,162 DEV : loss 0.20273931324481964 - f1-score (micro avg) 0.8261
2023-10-13 10:43:20,167 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:21,252 epoch 9 - iter 24/242 - loss 0.00546598 - time (sec): 1.08 - samples/sec: 2246.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:43:22,338 epoch 9 - iter 48/242 - loss 0.01181394 - time (sec): 2.17 - samples/sec: 2255.62 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:43:23,431 epoch 9 - iter 72/242 - loss 0.01261593 - time (sec): 3.26 - samples/sec: 2307.42 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:43:24,499 epoch 9 - iter 96/242 - loss 0.01100124 - time (sec): 4.33 - samples/sec: 2357.36 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:43:25,568 epoch 9 - iter 120/242 - loss 0.01432732 - time (sec): 5.40 - samples/sec: 2315.20 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:43:26,636 epoch 9 - iter 144/242 - loss 0.01455621 - time (sec): 6.47 - samples/sec: 2295.96 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:43:27,726 epoch 9 - iter 168/242 - loss 0.01285498 - time (sec): 7.56 - samples/sec: 2307.38 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:43:28,804 epoch 9 - iter 192/242 - loss 0.01567029 - time (sec): 8.64 - samples/sec: 2307.20 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:43:29,902 epoch 9 - iter 216/242 - loss 0.01442631 - time (sec): 9.73 - samples/sec: 2293.73 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:43:31,041 epoch 9 - iter 240/242 - loss 0.01351262 - time (sec): 10.87 - samples/sec: 2266.51 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:43:31,124 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:31,124 EPOCH 9 done: loss 0.0135 - lr: 0.000006
2023-10-13 10:43:31,926 DEV : loss 0.21898262202739716 - f1-score (micro avg) 0.825
2023-10-13 10:43:31,932 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:33,089 epoch 10 - iter 24/242 - loss 0.01936874 - time (sec): 1.16 - samples/sec: 2110.07 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:43:34,226 epoch 10 - iter 48/242 - loss 0.01331180 - time (sec): 2.29 - samples/sec: 2245.28 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:43:35,325 epoch 10 - iter 72/242 - loss 0.01101617 - time (sec): 3.39 - samples/sec: 2185.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:43:36,395 epoch 10 - iter 96/242 - loss 0.01008254 - time (sec): 4.46 - samples/sec: 2219.43 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:43:37,525 epoch 10 - iter 120/242 - loss 0.00985497 - time (sec): 5.59 - samples/sec: 2212.71 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:43:38,634 epoch 10 - iter 144/242 - loss 0.00963250 - time (sec): 6.70 - samples/sec: 2224.86 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:43:39,687 epoch 10 - iter 168/242 - loss 0.00900359 - time (sec): 7.75 - samples/sec: 2219.62 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:43:40,735 epoch 10 - iter 192/242 - loss 0.01096737 - time (sec): 8.80 - samples/sec: 2231.01 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:43:41,819 epoch 10 - iter 216/242 - loss 0.01087552 - time (sec): 9.89 - samples/sec: 2220.60 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:43:42,916 epoch 10 - iter 240/242 - loss 0.01026021 - time (sec): 10.98 - samples/sec: 2240.75 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:43:43,001 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:43,002 EPOCH 10 done: loss 0.0102 - lr: 0.000000
2023-10-13 10:43:43,777 DEV : loss 0.21501176059246063 - f1-score (micro avg) 0.8243
2023-10-13 10:43:44,172 ----------------------------------------------------------------------------------------------------
2023-10-13 10:43:44,174 Loading model from best epoch ...
2023-10-13 10:43:45,810 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
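The tagger emits BIOES tags (S = single-token entity, B/I/E = begin/inside/end) over the six entity types above. Evaluation then compares decoded spans, not individual tags. A minimal sketch of BIOES span decoding (the helper name `bioes_to_spans` is illustrative, not a Flair API):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                          # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":                        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
        # "I" continues a running entity; nothing to record yet
    return spans

tags = ["O", "S-pers", "B-work", "I-work", "E-work", "O"]
print(bioes_to_spans(tags))  # [('pers', 1, 2), ('work', 2, 5)]
```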
2023-10-13 10:43:46,651
Results:
- F-score (micro) 0.7553
- F-score (macro) 0.4488
- Accuracy 0.6242
By class:
              precision    recall  f1-score   support

        pers     0.8264    0.8561    0.8410       139
       scope     0.7351    0.8605    0.7929       129
        work     0.5567    0.6750    0.6102        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7245    0.7889    0.7553       360
   macro avg     0.4236    0.4783    0.4488       360
weighted avg     0.7062    0.7889    0.7444       360
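The aggregate rows follow from the per-class rows: macro averages the five per-class F1 scores unweighted (the unsupported loc and date classes drag it down to 0.4488), while the weighted average weighs each F1 by class support. A quick sanity check on the reported numbers:

```python
# Per-class rows from the final test report: (label, precision, recall, f1, support)
rows = [
    ("pers",  0.8264, 0.8561, 0.8410, 139),
    ("scope", 0.7351, 0.8605, 0.7929, 129),
    ("work",  0.5567, 0.6750, 0.6102,  80),
    ("loc",   0.0000, 0.0000, 0.0000,   9),
    ("date",  0.0000, 0.0000, 0.0000,   3),
]
total = sum(s for *_, s in rows)                        # 360 gold entities
macro_f1 = sum(f1 for *_, f1, _ in rows) / len(rows)
weighted_f1 = sum(f1 * s for *_, f1, s in rows) / total
print(round(macro_f1, 4))     # 0.4488
print(round(weighted_f1, 4))  # 0.7444
```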
2023-10-13 10:43:46,652 ----------------------------------------------------------------------------------------------------