2023-10-13 17:47:04,163 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Train: 5901 sentences
2023-10-13 17:47:04,164 (train_with_dev=False, train_with_test=False)
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Training Params:
2023-10-13 17:47:04,164 - learning_rate: "5e-05"
2023-10-13 17:47:04,164 - mini_batch_size: "8"
2023-10-13 17:47:04,164 - max_epochs: "10"
2023-10-13 17:47:04,164 - shuffle: "True"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Plugins:
2023-10-13 17:47:04,164 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
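The LinearScheduler plugin explains the `lr` column in the per-iteration lines below. Assuming the usual linear-warmup-then-linear-decay rule (the standard behavior for this plugin, not confirmed by the log itself): with 738 iterations x 10 epochs = 7380 total steps and warmup_fraction 0.1, the rate ramps to the 5e-05 peak over the first 738 steps and then decays linearly to zero. A minimal plain-Python sketch of that schedule:

```python
# Sketch of a linear warmup + linear decay schedule (assumption: this is
# what the LinearScheduler plugin does; the constants come from this log).
def linear_lr(step, total_steps=7380, warmup_fraction=0.1, peak_lr=5e-05):
    warmup_steps = int(total_steps * warmup_fraction)  # 738 steps
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # linear warm-up
    # linear decay from the peak down to zero over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values: lr ~ 0.000005 at epoch 1 iter 73,
# ~ 0.000049 at epoch 1 iter 730, and ~ 0 at the last step of epoch 10.
print(linear_lr(73), linear_lr(730), linear_lr(7380))
```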
2023-10-13 17:47:04,164 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 17:47:04,164 - metric: "('micro avg', 'f1-score')"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Computation:
2023-10-13 17:47:04,164 - compute on device: cuda:0
2023-10-13 17:47:04,164 - embedding storage: none
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,165 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:09,303 epoch 1 - iter 73/738 - loss 2.61704957 - time (sec): 5.14 - samples/sec: 3329.07 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:47:14,890 epoch 1 - iter 146/738 - loss 1.63900339 - time (sec): 10.72 - samples/sec: 3355.14 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:47:19,561 epoch 1 - iter 219/738 - loss 1.25224248 - time (sec): 15.40 - samples/sec: 3383.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:47:24,159 epoch 1 - iter 292/738 - loss 1.03925094 - time (sec): 19.99 - samples/sec: 3392.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:47:28,728 epoch 1 - iter 365/738 - loss 0.89541308 - time (sec): 24.56 - samples/sec: 3399.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:47:33,630 epoch 1 - iter 438/738 - loss 0.79056877 - time (sec): 29.46 - samples/sec: 3393.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:47:37,911 epoch 1 - iter 511/738 - loss 0.71979193 - time (sec): 33.75 - samples/sec: 3392.11 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:47:42,768 epoch 1 - iter 584/738 - loss 0.65785367 - time (sec): 38.60 - samples/sec: 3381.31 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:47:47,675 epoch 1 - iter 657/738 - loss 0.60430136 - time (sec): 43.51 - samples/sec: 3373.75 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:47:53,061 epoch 1 - iter 730/738 - loss 0.55420001 - time (sec): 48.90 - samples/sec: 3371.78 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:47:53,539 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:53,540 EPOCH 1 done: loss 0.5507 - lr: 0.000049
2023-10-13 17:47:59,706 DEV : loss 0.12785974144935608 - f1-score (micro avg)  0.7131
2023-10-13 17:47:59,734 saving best model
2023-10-13 17:48:00,205 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:04,704 epoch 2 - iter 73/738 - loss 0.14031504 - time (sec): 4.50 - samples/sec: 3262.14 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:09,192 epoch 2 - iter 146/738 - loss 0.13796547 - time (sec): 8.99 - samples/sec: 3313.17 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:14,464 epoch 2 - iter 219/738 - loss 0.13466397 - time (sec): 14.26 - samples/sec: 3360.71 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:19,296 epoch 2 - iter 292/738 - loss 0.13225824 - time (sec): 19.09 - samples/sec: 3355.88 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:24,137 epoch 2 - iter 365/738 - loss 0.13022111 - time (sec): 23.93 - samples/sec: 3350.86 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:29,160 epoch 2 - iter 438/738 - loss 0.12720644 - time (sec): 28.95 - samples/sec: 3364.48 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:34,186 epoch 2 - iter 511/738 - loss 0.12457501 - time (sec): 33.98 - samples/sec: 3340.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:38,951 epoch 2 - iter 584/738 - loss 0.12421710 - time (sec): 38.74 - samples/sec: 3349.70 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:44,401 epoch 2 - iter 657/738 - loss 0.12162126 - time (sec): 44.19 - samples/sec: 3350.95 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,387 epoch 2 - iter 730/738 - loss 0.11931305 - time (sec): 49.18 - samples/sec: 3349.23 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,874 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:49,874 EPOCH 2 done: loss 0.1189 - lr: 0.000045
2023-10-13 17:49:01,090 DEV : loss 0.13197503983974457 - f1-score (micro avg)  0.7308
2023-10-13 17:49:01,119 saving best model
2023-10-13 17:49:01,598 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:06,572 epoch 3 - iter 73/738 - loss 0.07614814 - time (sec): 4.97 - samples/sec: 3274.66 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:49:11,815 epoch 3 - iter 146/738 - loss 0.07670962 - time (sec): 10.21 - samples/sec: 3311.41 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:16,339 epoch 3 - iter 219/738 - loss 0.07615619 - time (sec): 14.74 - samples/sec: 3339.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:21,598 epoch 3 - iter 292/738 - loss 0.08533494 - time (sec): 20.00 - samples/sec: 3355.66 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:26,329 epoch 3 - iter 365/738 - loss 0.08052962 - time (sec): 24.73 - samples/sec: 3350.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:31,257 epoch 3 - iter 438/738 - loss 0.07707434 - time (sec): 29.65 - samples/sec: 3330.19 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:36,060 epoch 3 - iter 511/738 - loss 0.07639684 - time (sec): 34.46 - samples/sec: 3344.43 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:41,408 epoch 3 - iter 584/738 - loss 0.07393004 - time (sec): 39.80 - samples/sec: 3333.01 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:46,311 epoch 3 - iter 657/738 - loss 0.07260392 - time (sec): 44.71 - samples/sec: 3315.95 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:51,524 epoch 3 - iter 730/738 - loss 0.07247522 - time (sec): 49.92 - samples/sec: 3305.85 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:49:51,967 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:51,967 EPOCH 3 done: loss 0.0725 - lr: 0.000039
2023-10-13 17:50:03,343 DEV : loss 0.1486140638589859 - f1-score (micro avg)  0.7833
2023-10-13 17:50:03,372 saving best model
2023-10-13 17:50:03,852 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:09,135 epoch 4 - iter 73/738 - loss 0.05135216 - time (sec): 5.28 - samples/sec: 3383.66 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:13,787 epoch 4 - iter 146/738 - loss 0.05140785 - time (sec): 9.93 - samples/sec: 3335.68 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:19,538 epoch 4 - iter 219/738 - loss 0.04837867 - time (sec): 15.68 - samples/sec: 3374.63 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:24,656 epoch 4 - iter 292/738 - loss 0.05298887 - time (sec): 20.80 - samples/sec: 3349.51 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:29,288 epoch 4 - iter 365/738 - loss 0.05192526 - time (sec): 25.43 - samples/sec: 3358.42 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:34,582 epoch 4 - iter 438/738 - loss 0.05297484 - time (sec): 30.72 - samples/sec: 3368.28 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:39,204 epoch 4 - iter 511/738 - loss 0.05272183 - time (sec): 35.34 - samples/sec: 3368.87 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:43,882 epoch 4 - iter 584/738 - loss 0.05406746 - time (sec): 40.02 - samples/sec: 3349.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:48,214 epoch 4 - iter 657/738 - loss 0.05441271 - time (sec): 44.35 - samples/sec: 3352.26 - lr: 0.000034 - momentum: 0.000000
2023-10-13 17:50:52,924 epoch 4 - iter 730/738 - loss 0.05358667 - time (sec): 49.06 - samples/sec: 3359.88 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:50:53,379 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:53,379 EPOCH 4 done: loss 0.0533 - lr: 0.000033
2023-10-13 17:51:04,589 DEV : loss 0.1737738847732544 - f1-score (micro avg)  0.8049
2023-10-13 17:51:04,619 saving best model
2023-10-13 17:51:05,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:10,235 epoch 5 - iter 73/738 - loss 0.03032376 - time (sec): 5.09 - samples/sec: 3312.13 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:51:14,799 epoch 5 - iter 146/738 - loss 0.03581647 - time (sec): 9.65 - samples/sec: 3328.98 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:19,339 epoch 5 - iter 219/738 - loss 0.03520240 - time (sec): 14.19 - samples/sec: 3390.73 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:24,352 epoch 5 - iter 292/738 - loss 0.03677044 - time (sec): 19.21 - samples/sec: 3407.81 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:29,378 epoch 5 - iter 365/738 - loss 0.03507287 - time (sec): 24.23 - samples/sec: 3366.45 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:34,241 epoch 5 - iter 438/738 - loss 0.03455833 - time (sec): 29.10 - samples/sec: 3355.44 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:39,901 epoch 5 - iter 511/738 - loss 0.03549297 - time (sec): 34.76 - samples/sec: 3357.51 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:44,117 epoch 5 - iter 584/738 - loss 0.03649335 - time (sec): 38.97 - samples/sec: 3377.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 17:51:49,172 epoch 5 - iter 657/738 - loss 0.03572356 - time (sec): 44.03 - samples/sec: 3376.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:53,851 epoch 5 - iter 730/738 - loss 0.03593262 - time (sec): 48.71 - samples/sec: 3383.72 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:54,287 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:54,287 EPOCH 5 done: loss 0.0360 - lr: 0.000028
2023-10-13 17:52:05,499 DEV : loss 0.1812753677368164 - f1-score (micro avg)  0.8177
2023-10-13 17:52:05,531 saving best model
2023-10-13 17:52:06,113 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:11,732 epoch 6 - iter 73/738 - loss 0.01621660 - time (sec): 5.61 - samples/sec: 3000.04 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:16,775 epoch 6 - iter 146/738 - loss 0.02080977 - time (sec): 10.66 - samples/sec: 3100.09 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:21,263 epoch 6 - iter 219/738 - loss 0.01876135 - time (sec): 15.14 - samples/sec: 3144.78 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:25,848 epoch 6 - iter 292/738 - loss 0.02138464 - time (sec): 19.73 - samples/sec: 3183.99 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:31,015 epoch 6 - iter 365/738 - loss 0.01978816 - time (sec): 24.90 - samples/sec: 3209.75 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:35,258 epoch 6 - iter 438/738 - loss 0.01938038 - time (sec): 29.14 - samples/sec: 3225.51 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:40,274 epoch 6 - iter 511/738 - loss 0.01928215 - time (sec): 34.16 - samples/sec: 3254.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 17:52:45,579 epoch 6 - iter 584/738 - loss 0.01987479 - time (sec): 39.46 - samples/sec: 3280.26 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:51,330 epoch 6 - iter 657/738 - loss 0.02178292 - time (sec): 45.21 - samples/sec: 3297.22 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:56,145 epoch 6 - iter 730/738 - loss 0.02282192 - time (sec): 50.03 - samples/sec: 3300.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:52:56,552 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:56,552 EPOCH 6 done: loss 0.0228 - lr: 0.000022
2023-10-13 17:53:07,779 DEV : loss 0.21827659010887146 - f1-score (micro avg)  0.7988
2023-10-13 17:53:07,809 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:12,402 epoch 7 - iter 73/738 - loss 0.01465345 - time (sec): 4.59 - samples/sec: 3353.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:53:16,861 epoch 7 - iter 146/738 - loss 0.01573536 - time (sec): 9.05 - samples/sec: 3298.73 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:21,855 epoch 7 - iter 219/738 - loss 0.01844721 - time (sec): 14.04 - samples/sec: 3360.00 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:26,583 epoch 7 - iter 292/738 - loss 0.01755483 - time (sec): 18.77 - samples/sec: 3348.38 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:31,556 epoch 7 - iter 365/738 - loss 0.01732233 - time (sec): 23.75 - samples/sec: 3348.92 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:36,455 epoch 7 - iter 438/738 - loss 0.01872450 - time (sec): 28.64 - samples/sec: 3348.83 - lr: 0.000019 - momentum: 0.000000
2023-10-13 17:53:41,246 epoch 7 - iter 511/738 - loss 0.01801696 - time (sec): 33.44 - samples/sec: 3353.96 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:46,182 epoch 7 - iter 584/738 - loss 0.01924045 - time (sec): 38.37 - samples/sec: 3350.46 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:51,818 epoch 7 - iter 657/738 - loss 0.01897246 - time (sec): 44.01 - samples/sec: 3363.15 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:56,775 epoch 7 - iter 730/738 - loss 0.01878918 - time (sec): 48.96 - samples/sec: 3357.79 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:57,373 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:57,373 EPOCH 7 done: loss 0.0186 - lr: 0.000017
2023-10-13 17:54:08,578 DEV : loss 0.20159663259983063 - f1-score (micro avg)  0.8255
2023-10-13 17:54:08,607 saving best model
2023-10-13 17:54:09,182 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:14,384 epoch 8 - iter 73/738 - loss 0.00867283 - time (sec): 5.20 - samples/sec: 3376.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:18,924 epoch 8 - iter 146/738 - loss 0.00929264 - time (sec): 9.74 - samples/sec: 3338.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:23,900 epoch 8 - iter 219/738 - loss 0.00988928 - time (sec): 14.71 - samples/sec: 3360.24 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:28,467 epoch 8 - iter 292/738 - loss 0.01078611 - time (sec): 19.28 - samples/sec: 3357.48 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:33,531 epoch 8 - iter 365/738 - loss 0.01188262 - time (sec): 24.34 - samples/sec: 3332.34 - lr: 0.000014 - momentum: 0.000000
2023-10-13 17:54:39,068 epoch 8 - iter 438/738 - loss 0.01168210 - time (sec): 29.88 - samples/sec: 3313.89 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:43,328 epoch 8 - iter 511/738 - loss 0.01108505 - time (sec): 34.14 - samples/sec: 3334.40 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:48,539 epoch 8 - iter 584/738 - loss 0.01139473 - time (sec): 39.35 - samples/sec: 3327.81 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:53,175 epoch 8 - iter 657/738 - loss 0.01059925 - time (sec): 43.99 - samples/sec: 3333.14 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:58,385 epoch 8 - iter 730/738 - loss 0.01213062 - time (sec): 49.20 - samples/sec: 3351.79 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:54:58,847 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:58,847 EPOCH 8 done: loss 0.0120 - lr: 0.000011
2023-10-13 17:55:10,116 DEV : loss 0.2121274471282959 - f1-score (micro avg)  0.8167
2023-10-13 17:55:10,146 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:15,100 epoch 9 - iter 73/738 - loss 0.00785780 - time (sec): 4.95 - samples/sec: 3384.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:55:20,217 epoch 9 - iter 146/738 - loss 0.00968859 - time (sec): 10.07 - samples/sec: 3323.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:24,538 epoch 9 - iter 219/738 - loss 0.00766936 - time (sec): 14.39 - samples/sec: 3355.87 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:29,150 epoch 9 - iter 292/738 - loss 0.00776847 - time (sec): 19.00 - samples/sec: 3343.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 17:55:34,169 epoch 9 - iter 365/738 - loss 0.00786568 - time (sec): 24.02 - samples/sec: 3303.61 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:39,513 epoch 9 - iter 438/738 - loss 0.00805319 - time (sec): 29.37 - samples/sec: 3304.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:44,830 epoch 9 - iter 511/738 - loss 0.00739363 - time (sec): 34.68 - samples/sec: 3308.13 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:49,338 epoch 9 - iter 584/738 - loss 0.00731685 - time (sec): 39.19 - samples/sec: 3324.15 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:54,064 epoch 9 - iter 657/738 - loss 0.00765869 - time (sec): 43.92 - samples/sec: 3324.65 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,126 epoch 9 - iter 730/738 - loss 0.00759718 - time (sec): 48.98 - samples/sec: 3359.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,614 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:59,614 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 17:56:10,875 DEV : loss 0.22374621033668518 - f1-score (micro avg)  0.8242
2023-10-13 17:56:10,904 ----------------------------------------------------------------------------------------------------
2023-10-13 17:56:16,195 epoch 10 - iter 73/738 - loss 0.00432633 - time (sec): 5.29 - samples/sec: 3017.62 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:56:21,075 epoch 10 - iter 146/738 - loss 0.00341698 - time (sec): 10.17 - samples/sec: 3203.85 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:25,457 epoch 10 - iter 219/738 - loss 0.00467938 - time (sec): 14.55 - samples/sec: 3261.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:30,710 epoch 10 - iter 292/738 - loss 0.00480875 - time (sec): 19.81 - samples/sec: 3313.14 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:36,255 epoch 10 - iter 365/738 - loss 0.00575497 - time (sec): 25.35 - samples/sec: 3311.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:40,976 epoch 10 - iter 438/738 - loss 0.00574872 - time (sec): 30.07 - samples/sec: 3312.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:45,946 epoch 10 - iter 511/738 - loss 0.00539595 - time (sec): 35.04 - samples/sec: 3329.58 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:51,370 epoch 10 - iter 584/738 - loss 0.00518260 - time (sec): 40.47 - samples/sec: 3321.26 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:56:56,106 epoch 10 - iter 657/738 - loss 0.00512442 - time (sec): 45.20 - samples/sec: 3321.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:57:00,525 epoch 10 - iter 730/738 - loss 0.00499521 - time (sec): 49.62 - samples/sec: 3319.82 - lr: 0.000000 - momentum: 0.000000
2023-10-13 17:57:00,998 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:00,999 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-13 17:57:12,280 DEV : loss 0.22519326210021973 - f1-score (micro avg)  0.8266
2023-10-13 17:57:12,310 saving best model
2023-10-13 17:57:13,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:13,141 Loading model from best epoch ...
2023-10-13 17:57:14,542 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
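The 21-entry tag dictionary is the BIOES encoding of the five HIPE-2020 entity types plus the outside tag O, and the Linear(in_features=768, out_features=21) head above is sized to exactly this dictionary. A short sketch that reconstructs the tag set from the log (hypothetical helper, not Flair code):

```python
# Rebuild the 21-tag BIOES dictionary listed in the log: O plus
# {S,B,E,I}-<type> for each of the five entity types (hypothetical helper).
entity_types = ["loc", "pers", "org", "time", "prod"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in ("S", "B", "E", "I")]
print(len(tags))  # 1 + 4 * 5 = 21, the output size of the linear layer
```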
2023-10-13 17:57:20,591
Results:
- F-score (micro) 0.8013
- F-score (macro) 0.7071
- Accuracy 0.6949

By class:
              precision    recall  f1-score   support

         loc     0.8622    0.8823    0.8721       858
        pers     0.7549    0.7970    0.7754       537
         org     0.5652    0.5909    0.5778       132
        time     0.5484    0.6296    0.5862        54
        prod     0.7636    0.6885    0.7241        61

   micro avg     0.7876    0.8155    0.8013      1642
   macro avg     0.6989    0.7177    0.7071      1642
weighted avg     0.7892    0.8155    0.8019      1642

2023-10-13 17:57:20,591 ----------------------------------------------------------------------------------------------------
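The micro, macro, and weighted averages in the report follow directly from the per-class rows, which makes for a quick consistency check on the table. A plain-Python sketch (all numbers copied from the report above):

```python
# Per-class (precision, recall, f1, support) rows from the final test report.
rows = {
    "loc":  (0.8622, 0.8823, 0.8721, 858),
    "pers": (0.7549, 0.7970, 0.7754, 537),
    "org":  (0.5652, 0.5909, 0.5778, 132),
    "time": (0.5484, 0.6296, 0.5862, 54),
    "prod": (0.7636, 0.6885, 0.7241, 61),
}
total = sum(s for *_, s in rows.values())                      # 1642 entities
macro_f1 = sum(f for _, _, f, _ in rows.values()) / len(rows)  # unweighted mean
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total
# Micro F1 is the harmonic mean of the micro-averaged precision and recall.
p, r = 0.7876, 0.8155
micro_f1 = 2 * p * r / (p + r)
print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# reproduces the reported 0.7071, 0.8019, and 0.8013
```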