2023-10-13 15:56:41,973 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,974 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Train: 5901 sentences
2023-10-13 15:56:41,975 (train_with_dev=False, train_with_test=False)
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Training Params:
2023-10-13 15:56:41,975 - learning_rate: "3e-05"
2023-10-13 15:56:41,975 - mini_batch_size: "8"
2023-10-13 15:56:41,975 - max_epochs: "10"
2023-10-13 15:56:41,975 - shuffle: "True"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Plugins:
2023-10-13 15:56:41,975 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 15:56:41,975 - metric: "('micro avg', 'f1-score')"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Computation:
2023-10-13 15:56:41,975 - compute on device: cuda:0
2023-10-13 15:56:41,975 - embedding storage: none
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:47,173 epoch 1 - iter 73/738 - loss 3.05853360 - time (sec): 5.20 - samples/sec: 3385.89 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:56:52,161 epoch 1 - iter 146/738 - loss 2.01396427 - time (sec): 10.18 - samples/sec: 3514.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:56:56,988 epoch 1 - iter 219/738 - loss 1.53824384 - time (sec): 15.01 - samples/sec: 3444.77 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:57:01,882 epoch 1 - iter 292/738 - loss 1.24891774 - time (sec): 19.91 - samples/sec: 3432.48 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:57:06,840 epoch 1 - iter 365/738 - loss 1.07452913 - time (sec): 24.86 - samples/sec: 3418.88 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:57:11,658 epoch 1 - iter 438/738 - loss 0.94521822 - time (sec): 29.68 - samples/sec: 3426.34 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:57:16,509 epoch 1 - iter 511/738 - loss 0.85158135 - time (sec): 34.53 - samples/sec: 3412.22 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:57:20,999 epoch 1 - iter 584/738 - loss 0.78332215 - time (sec): 39.02 - samples/sec: 3392.42 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:57:25,830 epoch 1 - iter 657/738 - loss 0.72187463 - time (sec): 43.85 - samples/sec: 3388.05 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:57:30,631 epoch 1 - iter 730/738 - loss 0.66940959 - time (sec): 48.65 - samples/sec: 3389.93 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:57:31,100 ----------------------------------------------------------------------------------------------------
2023-10-13 15:57:31,100 EPOCH 1 done: loss 0.6654 - lr: 0.000030
2023-10-13 15:57:37,172 DEV : loss 0.14468906819820404 - f1-score (micro avg) 0.7202
2023-10-13 15:57:37,201 saving best model
2023-10-13 15:57:37,657 ----------------------------------------------------------------------------------------------------
2023-10-13 15:57:41,990 epoch 2 - iter 73/738 - loss 0.14759708 - time (sec): 4.33 - samples/sec: 3490.78 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:57:46,742 epoch 2 - iter 146/738 - loss 0.14672610 - time (sec): 9.08 - samples/sec: 3447.45 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:57:51,446 epoch 2 - iter 219/738 - loss 0.14523852 - time (sec): 13.79 - samples/sec: 3442.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:57:56,392 epoch 2 - iter 292/738 - loss 0.14107499 - time (sec): 18.73 - samples/sec: 3369.54 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:58:01,243 epoch 2 - iter 365/738 - loss 0.14216356 - time (sec): 23.58 - samples/sec: 3333.33 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:58:06,232 epoch 2 - iter 438/738 - loss 0.13781464 - time (sec): 28.57 - samples/sec: 3340.70 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:58:11,636 epoch 2 - iter 511/738 - loss 0.13577481 - time (sec): 33.98 - samples/sec: 3350.22 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:58:16,471 epoch 2 - iter 584/738 - loss 0.13005589 - time (sec): 38.81 - samples/sec: 3352.85 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:58:21,461 epoch 2 - iter 657/738 - loss 0.12956762 - time (sec): 43.80 - samples/sec: 3362.65 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:58:26,771 epoch 2 - iter 730/738 - loss 0.12951791 - time (sec): 49.11 - samples/sec: 3353.97 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:58:27,274 ----------------------------------------------------------------------------------------------------
2023-10-13 15:58:27,274 EPOCH 2 done: loss 0.1294 - lr: 0.000027
2023-10-13 15:58:38,435 DEV : loss 0.11062650382518768 - f1-score (micro avg) 0.7675
2023-10-13 15:58:38,464 saving best model
2023-10-13 15:58:39,090 ----------------------------------------------------------------------------------------------------
2023-10-13 15:58:43,863 epoch 3 - iter 73/738 - loss 0.06065277 - time (sec): 4.77 - samples/sec: 3236.91 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:58:48,700 epoch 3 - iter 146/738 - loss 0.07316894 - time (sec): 9.61 - samples/sec: 3338.82 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:58:54,048 epoch 3 - iter 219/738 - loss 0.07863303 - time (sec): 14.95 - samples/sec: 3233.00 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:58:58,383 epoch 3 - iter 292/738 - loss 0.07743735 - time (sec): 19.29 - samples/sec: 3281.31 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:59:03,909 epoch 3 - iter 365/738 - loss 0.07606154 - time (sec): 24.81 - samples/sec: 3262.96 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:59:09,129 epoch 3 - iter 438/738 - loss 0.07463572 - time (sec): 30.03 - samples/sec: 3309.18 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:59:13,987 epoch 3 - iter 511/738 - loss 0.07177934 - time (sec): 34.89 - samples/sec: 3306.95 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:59:18,942 epoch 3 - iter 584/738 - loss 0.07324880 - time (sec): 39.85 - samples/sec: 3320.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:59:24,146 epoch 3 - iter 657/738 - loss 0.07181062 - time (sec): 45.05 - samples/sec: 3315.65 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:59:28,957 epoch 3 - iter 730/738 - loss 0.07338875 - time (sec): 49.86 - samples/sec: 3305.07 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:59:29,439 ----------------------------------------------------------------------------------------------------
2023-10-13 15:59:29,439 EPOCH 3 done: loss 0.0738 - lr: 0.000023
2023-10-13 15:59:40,672 DEV : loss 0.10864270478487015 - f1-score (micro avg) 0.8175
2023-10-13 15:59:40,703 saving best model
2023-10-13 15:59:41,239 ----------------------------------------------------------------------------------------------------
2023-10-13 15:59:46,028 epoch 4 - iter 73/738 - loss 0.04491633 - time (sec): 4.79 - samples/sec: 3162.59 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:59:50,685 epoch 4 - iter 146/738 - loss 0.04351311 - time (sec): 9.44 - samples/sec: 3264.79 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:59:55,499 epoch 4 - iter 219/738 - loss 0.04436716 - time (sec): 14.26 - samples/sec: 3321.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:00:00,085 epoch 4 - iter 292/738 - loss 0.04559344 - time (sec): 18.84 - samples/sec: 3331.80 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:00:05,124 epoch 4 - iter 365/738 - loss 0.04601732 - time (sec): 23.88 - samples/sec: 3329.15 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:00:10,439 epoch 4 - iter 438/738 - loss 0.04450045 - time (sec): 29.20 - samples/sec: 3316.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:00:16,073 epoch 4 - iter 511/738 - loss 0.04465368 - time (sec): 34.83 - samples/sec: 3316.58 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:00:20,833 epoch 4 - iter 584/738 - loss 0.04598766 - time (sec): 39.59 - samples/sec: 3332.91 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:00:25,931 epoch 4 - iter 657/738 - loss 0.05017371 - time (sec): 44.69 - samples/sec: 3328.90 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:00:30,598 epoch 4 - iter 730/738 - loss 0.04880929 - time (sec): 49.36 - samples/sec: 3338.99 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:00:31,086 ----------------------------------------------------------------------------------------------------
2023-10-13 16:00:31,086 EPOCH 4 done: loss 0.0490 - lr: 0.000020
2023-10-13 16:00:42,240 DEV : loss 0.1474699079990387 - f1-score (micro avg) 0.7874
2023-10-13 16:00:42,273 ----------------------------------------------------------------------------------------------------
2023-10-13 16:00:47,244 epoch 5 - iter 73/738 - loss 0.04483958 - time (sec): 4.97 - samples/sec: 3309.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:00:52,261 epoch 5 - iter 146/738 - loss 0.03940692 - time (sec): 9.99 - samples/sec: 3306.44 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:00:57,081 epoch 5 - iter 219/738 - loss 0.03823729 - time (sec): 14.81 - samples/sec: 3372.73 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:01:01,852 epoch 5 - iter 292/738 - loss 0.03570809 - time (sec): 19.58 - samples/sec: 3372.43 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:01:06,660 epoch 5 - iter 365/738 - loss 0.03611074 - time (sec): 24.39 - samples/sec: 3366.09 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:01:11,422 epoch 5 - iter 438/738 - loss 0.03549566 - time (sec): 29.15 - samples/sec: 3345.91 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:01:16,434 epoch 5 - iter 511/738 - loss 0.03481242 - time (sec): 34.16 - samples/sec: 3329.04 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:01:21,988 epoch 5 - iter 584/738 - loss 0.03612795 - time (sec): 39.71 - samples/sec: 3312.86 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:01:27,611 epoch 5 - iter 657/738 - loss 0.03541289 - time (sec): 45.34 - samples/sec: 3304.32 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:01:32,079 epoch 5 - iter 730/738 - loss 0.03603975 - time (sec): 49.80 - samples/sec: 3305.92 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:01:32,629 ----------------------------------------------------------------------------------------------------
2023-10-13 16:01:32,630 EPOCH 5 done: loss 0.0359 - lr: 0.000017
2023-10-13 16:01:43,734 DEV : loss 0.15243035554885864 - f1-score (micro avg) 0.8137
2023-10-13 16:01:43,764 ----------------------------------------------------------------------------------------------------
2023-10-13 16:01:48,291 epoch 6 - iter 73/738 - loss 0.02753601 - time (sec): 4.53 - samples/sec: 3278.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:01:53,834 epoch 6 - iter 146/738 - loss 0.02412278 - time (sec): 10.07 - samples/sec: 3370.68 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:01:58,832 epoch 6 - iter 219/738 - loss 0.02730955 - time (sec): 15.07 - samples/sec: 3372.32 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:02:03,512 epoch 6 - iter 292/738 - loss 0.02914052 - time (sec): 19.75 - samples/sec: 3355.51 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:02:08,994 epoch 6 - iter 365/738 - loss 0.02963576 - time (sec): 25.23 - samples/sec: 3347.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:02:13,940 epoch 6 - iter 438/738 - loss 0.03069200 - time (sec): 30.18 - samples/sec: 3357.87 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:02:18,409 epoch 6 - iter 511/738 - loss 0.02902534 - time (sec): 34.64 - samples/sec: 3367.48 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:02:23,343 epoch 6 - iter 584/738 - loss 0.02744013 - time (sec): 39.58 - samples/sec: 3365.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:02:28,153 epoch 6 - iter 657/738 - loss 0.02726539 - time (sec): 44.39 - samples/sec: 3353.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:02:33,016 epoch 6 - iter 730/738 - loss 0.02719915 - time (sec): 49.25 - samples/sec: 3346.24 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:02:33,486 ----------------------------------------------------------------------------------------------------
2023-10-13 16:02:33,487 EPOCH 6 done: loss 0.0272 - lr: 0.000013
2023-10-13 16:02:44,645 DEV : loss 0.17250441014766693 - f1-score (micro avg) 0.8204
2023-10-13 16:02:44,676 saving best model
2023-10-13 16:02:45,316 ----------------------------------------------------------------------------------------------------
2023-10-13 16:02:50,958 epoch 7 - iter 73/738 - loss 0.01634149 - time (sec): 5.64 - samples/sec: 3018.98 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:02:55,302 epoch 7 - iter 146/738 - loss 0.01514888 - time (sec): 9.98 - samples/sec: 3189.06 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:03:00,715 epoch 7 - iter 219/738 - loss 0.01838780 - time (sec): 15.40 - samples/sec: 3270.41 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:03:06,168 epoch 7 - iter 292/738 - loss 0.01756859 - time (sec): 20.85 - samples/sec: 3303.19 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:03:10,622 epoch 7 - iter 365/738 - loss 0.01866420 - time (sec): 25.30 - samples/sec: 3307.34 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:03:15,201 epoch 7 - iter 438/738 - loss 0.01851098 - time (sec): 29.88 - samples/sec: 3315.07 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:03:19,804 epoch 7 - iter 511/738 - loss 0.01926756 - time (sec): 34.48 - samples/sec: 3336.22 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:03:24,445 epoch 7 - iter 584/738 - loss 0.01977881 - time (sec): 39.13 - samples/sec: 3335.09 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:03:29,619 epoch 7 - iter 657/738 - loss 0.01907489 - time (sec): 44.30 - samples/sec: 3305.95 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:03:35,450 epoch 7 - iter 730/738 - loss 0.01894560 - time (sec): 50.13 - samples/sec: 3288.28 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:03:35,947 ----------------------------------------------------------------------------------------------------
2023-10-13 16:03:35,948 EPOCH 7 done: loss 0.0190 - lr: 0.000010
2023-10-13 16:03:47,105 DEV : loss 0.20412878692150116 - f1-score (micro avg) 0.8154
2023-10-13 16:03:47,134 ----------------------------------------------------------------------------------------------------
2023-10-13 16:03:52,161 epoch 8 - iter 73/738 - loss 0.01344325 - time (sec): 5.03 - samples/sec: 3320.25 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:03:57,698 epoch 8 - iter 146/738 - loss 0.01748030 - time (sec): 10.56 - samples/sec: 3185.06 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:04:04,280 epoch 8 - iter 219/738 - loss 0.01866343 - time (sec): 17.14 - samples/sec: 3081.13 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:04:09,445 epoch 8 - iter 292/738 - loss 0.01990917 - time (sec): 22.31 - samples/sec: 3033.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:04:14,015 epoch 8 - iter 365/738 - loss 0.01831245 - time (sec): 26.88 - samples/sec: 3065.14 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:04:18,968 epoch 8 - iter 438/738 - loss 0.01923268 - time (sec): 31.83 - samples/sec: 3095.99 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:04:23,888 epoch 8 - iter 511/738 - loss 0.01795623 - time (sec): 36.75 - samples/sec: 3121.73 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:04:28,303 epoch 8 - iter 584/738 - loss 0.01764510 - time (sec): 41.17 - samples/sec: 3142.12 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:04:33,109 epoch 8 - iter 657/738 - loss 0.01668878 - time (sec): 45.97 - samples/sec: 3159.38 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:04:38,749 epoch 8 - iter 730/738 - loss 0.01600711 - time (sec): 51.61 - samples/sec: 3191.33 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:04:39,252 ----------------------------------------------------------------------------------------------------
2023-10-13 16:04:39,252 EPOCH 8 done: loss 0.0158 - lr: 0.000007
2023-10-13 16:04:50,416 DEV : loss 0.19149629771709442 - f1-score (micro avg) 0.826
2023-10-13 16:04:50,445 saving best model
2023-10-13 16:04:50,981 ----------------------------------------------------------------------------------------------------
2023-10-13 16:04:55,805 epoch 9 - iter 73/738 - loss 0.00698935 - time (sec): 4.82 - samples/sec: 3214.71 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:05:00,887 epoch 9 - iter 146/738 - loss 0.00952456 - time (sec): 9.91 - samples/sec: 3262.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:05:05,508 epoch 9 - iter 219/738 - loss 0.00949156 - time (sec): 14.53 - samples/sec: 3325.48 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:05:10,381 epoch 9 - iter 292/738 - loss 0.01022930 - time (sec): 19.40 - samples/sec: 3341.20 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:05:15,595 epoch 9 - iter 365/738 - loss 0.01120093 - time (sec): 24.61 - samples/sec: 3363.39 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:05:20,184 epoch 9 - iter 438/738 - loss 0.01043318 - time (sec): 29.20 - samples/sec: 3365.10 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:05:25,342 epoch 9 - iter 511/738 - loss 0.01017895 - time (sec): 34.36 - samples/sec: 3352.34 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:05:29,969 epoch 9 - iter 584/738 - loss 0.01017742 - time (sec): 38.99 - samples/sec: 3339.96 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:05:34,734 epoch 9 - iter 657/738 - loss 0.00985239 - time (sec): 43.75 - samples/sec: 3352.03 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:05:40,086 epoch 9 - iter 730/738 - loss 0.01123444 - time (sec): 49.10 - samples/sec: 3352.47 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:05:40,697 ----------------------------------------------------------------------------------------------------
2023-10-13 16:05:40,698 EPOCH 9 done: loss 0.0117 - lr: 0.000003
2023-10-13 16:05:51,786 DEV : loss 0.19480924308300018 - f1-score (micro avg) 0.8267
2023-10-13 16:05:51,816 saving best model
2023-10-13 16:05:52,417 ----------------------------------------------------------------------------------------------------
2023-10-13 16:05:57,660 epoch 10 - iter 73/738 - loss 0.01186703 - time (sec): 5.24 - samples/sec: 3361.08 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:06:02,305 epoch 10 - iter 146/738 - loss 0.00836274 - time (sec): 9.89 - samples/sec: 3390.77 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:06:06,585 epoch 10 - iter 219/738 - loss 0.01000260 - time (sec): 14.17 - samples/sec: 3456.57 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:06:11,468 epoch 10 - iter 292/738 - loss 0.00903905 - time (sec): 19.05 - samples/sec: 3427.99 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:06:16,389 epoch 10 - iter 365/738 - loss 0.00871698 - time (sec): 23.97 - samples/sec: 3393.43 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:06:21,868 epoch 10 - iter 438/738 - loss 0.00896638 - time (sec): 29.45 - samples/sec: 3398.02 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:06:26,485 epoch 10 - iter 511/738 - loss 0.00840529 - time (sec): 34.07 - samples/sec: 3368.60 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:06:32,017 epoch 10 - iter 584/738 - loss 0.00852755 - time (sec): 39.60 - samples/sec: 3327.66 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:06:37,311 epoch 10 - iter 657/738 - loss 0.00884580 - time (sec): 44.89 - samples/sec: 3302.90 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:06:42,635 epoch 10 - iter 730/738 - loss 0.00832851 - time (sec): 50.22 - samples/sec: 3284.25 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:06:43,054 ----------------------------------------------------------------------------------------------------
2023-10-13 16:06:43,054 EPOCH 10 done: loss 0.0083 - lr: 0.000000
2023-10-13 16:06:54,526 DEV : loss 0.19598710536956787 - f1-score (micro avg) 0.8274
2023-10-13 16:06:54,557 saving best model
2023-10-13 16:06:55,518 ----------------------------------------------------------------------------------------------------
2023-10-13 16:06:55,520 Loading model from best epoch ...
2023-10-13 16:06:57,087 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 16:07:03,859
Results:
- F-score (micro) 0.7929
- F-score (macro) 0.6926
- Accuracy 0.6785
By class:
              precision    recall  f1-score   support

         loc     0.8673    0.8683    0.8678       858
        pers     0.7402    0.8119    0.7744       537
         org     0.5067    0.5758    0.5390       132
        prod     0.6885    0.6885    0.6885        61
        time     0.5469    0.6481    0.5932        54

   micro avg     0.7742    0.8124    0.7929      1642
   macro avg     0.6699    0.7185    0.6926      1642
weighted avg     0.7796    0.8124    0.7951      1642
2023-10-13 16:07:03,859 ----------------------------------------------------------------------------------------------------
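
The aggregate rows of the final report can be cross-checked from the per-class rows. The snippet below is a minimal sketch, not part of the Flair output: the variable and function names are illustrative, and the numbers are copied by hand from the table above. Because the logged values are rounded to four decimals, the recomputed figures should only be expected to agree to roughly that precision.

```python
# Per-class (precision, recall, f1, support), copied from the "By class" table.
by_class = {
    "loc":  (0.8673, 0.8683, 0.8678, 858),
    "pers": (0.7402, 0.8119, 0.7744, 537),
    "org":  (0.5067, 0.5758, 0.5390, 132),
    "prod": (0.6885, 0.6885, 0.6885, 61),
    "time": (0.5469, 0.6481, 0.5932, 54),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in by_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted average: per-class F1 weighted by the class support.
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

# Micro F1 recomputed from the logged (rounded) micro precision and recall.
micro_f1 = f1(0.7742, 0.8124)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between micro F1 (0.7929) and macro F1 (0.6926) follows from the class imbalance: `loc` and `pers` account for 1395 of the 1642 gold entities, so the weaker `org` and `time` scores pull the macro average down much more than the micro average.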