2023-10-08 18:52:59,724 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,725 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-08 18:52:59,725 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,726 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 18:52:59,726 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,726 Train: 966 sentences
2023-10-08 18:52:59,726 (train_with_dev=False, train_with_test=False)
2023-10-08 18:52:59,726 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,726 Training Params:
2023-10-08 18:52:59,726 - learning_rate: "0.00015"
2023-10-08 18:52:59,726 - mini_batch_size: "8"
2023-10-08 18:52:59,726 - max_epochs: "10"
2023-10-08 18:52:59,726 - shuffle: "True"
2023-10-08 18:52:59,726 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,726 Plugins:
2023-10-08 18:52:59,726 - TensorboardLogger
2023-10-08 18:52:59,726 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 18:52:59,726 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,726 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 18:52:59,726 - metric: "('micro avg', 'f1-score')"
2023-10-08 18:52:59,726 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,727 Computation:
2023-10-08 18:52:59,727 - compute on device: cuda:0
2023-10-08 18:52:59,727 - embedding storage: none
2023-10-08 18:52:59,727 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,727 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
2023-10-08 18:52:59,727 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,727 ----------------------------------------------------------------------------------------------------
2023-10-08 18:52:59,727 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 18:53:07,754 epoch 1 - iter 12/121 - loss 3.23519832 - time (sec): 8.03 - samples/sec: 272.73 - lr: 0.000014 - momentum: 0.000000
2023-10-08 18:53:16,381 epoch 1 - iter 24/121 - loss 3.22887473 - time (sec): 16.65 - samples/sec: 278.94 - lr: 0.000029 - momentum: 0.000000
2023-10-08 18:53:24,624 epoch 1 - iter 36/121 - loss 3.21890088 - time (sec): 24.90 - samples/sec: 278.40 - lr: 0.000043 - momentum: 0.000000
2023-10-08 18:53:33,141 epoch 1 - iter 48/121 - loss 3.20117517 - time (sec): 33.41 - samples/sec: 282.29 - lr: 0.000058 - momentum: 0.000000
2023-10-08 18:53:41,751 epoch 1 - iter 60/121 - loss 3.17067860 - time (sec): 42.02 - samples/sec: 280.21 - lr: 0.000073 - momentum: 0.000000
2023-10-08 18:53:50,982 epoch 1 - iter 72/121 - loss 3.11416165 - time (sec): 51.25 - samples/sec: 281.23 - lr: 0.000088 - momentum: 0.000000
2023-10-08 18:53:59,324 epoch 1 - iter 84/121 - loss 3.05217750 - time (sec): 59.60 - samples/sec: 280.88 - lr: 0.000103 - momentum: 0.000000
2023-10-08 18:54:07,586 epoch 1 - iter 96/121 - loss 2.98099240 - time (sec): 67.86 - samples/sec: 280.50 - lr: 0.000118 - momentum: 0.000000
2023-10-08 18:54:16,787 epoch 1 - iter 108/121 - loss 2.89062479 - time (sec): 77.06 - samples/sec: 282.07 - lr: 0.000133 - momentum: 0.000000
2023-10-08 18:54:26,163 epoch 1 - iter 120/121 - loss 2.79338928 - time (sec): 86.43 - samples/sec: 283.60 - lr: 0.000148 - momentum: 0.000000
2023-10-08 18:54:26,835 ----------------------------------------------------------------------------------------------------
2023-10-08 18:54:26,835 EPOCH 1 done: loss 2.7858 - lr: 0.000148
2023-10-08 18:54:32,493 DEV : loss 1.870829701423645 - f1-score (micro avg) 0.0
2023-10-08 18:54:32,499 ----------------------------------------------------------------------------------------------------
2023-10-08 18:54:40,953 epoch 2 - iter 12/121 - loss 1.87382708 - time (sec): 8.45 - samples/sec: 268.68 - lr: 0.000148 - momentum: 0.000000
2023-10-08 18:54:49,428 epoch 2 - iter 24/121 - loss 1.75718116 - time (sec): 16.93 - samples/sec: 275.58 - lr: 0.000147 - momentum: 0.000000
2023-10-08 18:54:57,823 epoch 2 - iter 36/121 - loss 1.65406239 - time (sec): 25.32 - samples/sec: 281.05 - lr: 0.000145 - momentum: 0.000000
2023-10-08 18:55:06,470 epoch 2 - iter 48/121 - loss 1.55664839 - time (sec): 33.97 - samples/sec: 283.52 - lr: 0.000144 - momentum: 0.000000
2023-10-08 18:55:14,955 epoch 2 - iter 60/121 - loss 1.46630166 - time (sec): 42.45 - samples/sec: 282.70 - lr: 0.000142 - momentum: 0.000000
2023-10-08 18:55:23,934 epoch 2 - iter 72/121 - loss 1.37157843 - time (sec): 51.43 - samples/sec: 286.00 - lr: 0.000140 - momentum: 0.000000
2023-10-08 18:55:33,251 epoch 2 - iter 84/121 - loss 1.29226843 - time (sec): 60.75 - samples/sec: 285.25 - lr: 0.000139 - momentum: 0.000000
2023-10-08 18:55:41,639 epoch 2 - iter 96/121 - loss 1.22297388 - time (sec): 69.14 - samples/sec: 284.94 - lr: 0.000137 - momentum: 0.000000
2023-10-08 18:55:49,868 epoch 2 - iter 108/121 - loss 1.16607756 - time (sec): 77.37 - samples/sec: 284.24 - lr: 0.000135 - momentum: 0.000000
2023-10-08 18:55:58,771 epoch 2 - iter 120/121 - loss 1.10678934 - time (sec): 86.27 - samples/sec: 285.40 - lr: 0.000134 - momentum: 0.000000
2023-10-08 18:55:59,247 ----------------------------------------------------------------------------------------------------
2023-10-08 18:55:59,248 EPOCH 2 done: loss 1.1043 - lr: 0.000134
2023-10-08 18:56:05,007 DEV : loss 0.6626896262168884 - f1-score (micro avg) 0.0
2023-10-08 18:56:05,015 ----------------------------------------------------------------------------------------------------
2023-10-08 18:56:13,901 epoch 3 - iter 12/121 - loss 0.62663391 - time (sec): 8.88 - samples/sec: 291.20 - lr: 0.000132 - momentum: 0.000000
2023-10-08 18:56:22,515 epoch 3 - iter 24/121 - loss 0.65157020 - time (sec): 17.50 - samples/sec: 288.72 - lr: 0.000130 - momentum: 0.000000
2023-10-08 18:56:31,555 epoch 3 - iter 36/121 - loss 0.62153740 - time (sec): 26.54 - samples/sec: 291.77 - lr: 0.000129 - momentum: 0.000000
2023-10-08 18:56:40,024 epoch 3 - iter 48/121 - loss 0.60941090 - time (sec): 35.01 - samples/sec: 287.77 - lr: 0.000127 - momentum: 0.000000
2023-10-08 18:56:48,248 epoch 3 - iter 60/121 - loss 0.58099303 - time (sec): 43.23 - samples/sec: 286.00 - lr: 0.000125 - momentum: 0.000000
2023-10-08 18:56:56,457 epoch 3 - iter 72/121 - loss 0.56416226 - time (sec): 51.44 - samples/sec: 284.89 - lr: 0.000124 - momentum: 0.000000
2023-10-08 18:57:05,639 epoch 3 - iter 84/121 - loss 0.54936968 - time (sec): 60.62 - samples/sec: 285.36 - lr: 0.000122 - momentum: 0.000000
2023-10-08 18:57:14,671 epoch 3 - iter 96/121 - loss 0.53149630 - time (sec): 69.65 - samples/sec: 286.11 - lr: 0.000120 - momentum: 0.000000
2023-10-08 18:57:22,912 epoch 3 - iter 108/121 - loss 0.51386588 - time (sec): 77.90 - samples/sec: 284.83 - lr: 0.000119 - momentum: 0.000000
2023-10-08 18:57:31,553 epoch 3 - iter 120/121 - loss 0.50175773 - time (sec): 86.54 - samples/sec: 284.19 - lr: 0.000117 - momentum: 0.000000
2023-10-08 18:57:32,093 ----------------------------------------------------------------------------------------------------
2023-10-08 18:57:32,093 EPOCH 3 done: loss 0.5006 - lr: 0.000117
2023-10-08 18:57:37,916 DEV : loss 0.38058072328567505 - f1-score (micro avg) 0.1407
2023-10-08 18:57:37,922 saving best model
2023-10-08 18:57:38,825 ----------------------------------------------------------------------------------------------------
2023-10-08 18:57:46,828 epoch 4 - iter 12/121 - loss 0.42696517 - time (sec): 8.00 - samples/sec: 274.65 - lr: 0.000115 - momentum: 0.000000
2023-10-08 18:57:55,827 epoch 4 - iter 24/121 - loss 0.39042726 - time (sec): 16.99 - samples/sec: 284.85 - lr: 0.000114 - momentum: 0.000000
2023-10-08 18:58:04,634 epoch 4 - iter 36/121 - loss 0.36317809 - time (sec): 25.80 - samples/sec: 281.81 - lr: 0.000112 - momentum: 0.000000
2023-10-08 18:58:13,133 epoch 4 - iter 48/121 - loss 0.34968457 - time (sec): 34.30 - samples/sec: 282.27 - lr: 0.000110 - momentum: 0.000000
2023-10-08 18:58:21,453 epoch 4 - iter 60/121 - loss 0.34601826 - time (sec): 42.62 - samples/sec: 282.23 - lr: 0.000109 - momentum: 0.000000
2023-10-08 18:58:31,026 epoch 4 - iter 72/121 - loss 0.33431219 - time (sec): 52.19 - samples/sec: 283.83 - lr: 0.000107 - momentum: 0.000000
2023-10-08 18:58:40,072 epoch 4 - iter 84/121 - loss 0.32217986 - time (sec): 61.24 - samples/sec: 284.02 - lr: 0.000105 - momentum: 0.000000
2023-10-08 18:58:48,661 epoch 4 - iter 96/121 - loss 0.32106188 - time (sec): 69.83 - samples/sec: 282.13 - lr: 0.000104 - momentum: 0.000000
2023-10-08 18:58:57,551 epoch 4 - iter 108/121 - loss 0.32129330 - time (sec): 78.72 - samples/sec: 281.44 - lr: 0.000102 - momentum: 0.000000
2023-10-08 18:59:06,296 epoch 4 - iter 120/121 - loss 0.31202908 - time (sec): 87.46 - samples/sec: 280.57 - lr: 0.000101 - momentum: 0.000000
2023-10-08 18:59:06,956 ----------------------------------------------------------------------------------------------------
2023-10-08 18:59:06,956 EPOCH 4 done: loss 0.3109 - lr: 0.000101
2023-10-08 18:59:12,978 DEV : loss 0.2757081091403961 - f1-score (micro avg) 0.4688
2023-10-08 18:59:12,984 saving best model
2023-10-08 18:59:17,472 ----------------------------------------------------------------------------------------------------
2023-10-08 18:59:26,540 epoch 5 - iter 12/121 - loss 0.30694563 - time (sec): 9.07 - samples/sec: 279.27 - lr: 0.000099 - momentum: 0.000000
2023-10-08 18:59:35,420 epoch 5 - iter 24/121 - loss 0.26663117 - time (sec): 17.95 - samples/sec: 274.54 - lr: 0.000097 - momentum: 0.000000
2023-10-08 18:59:44,953 epoch 5 - iter 36/121 - loss 0.25869032 - time (sec): 27.48 - samples/sec: 280.50 - lr: 0.000095 - momentum: 0.000000
2023-10-08 18:59:54,305 epoch 5 - iter 48/121 - loss 0.25340295 - time (sec): 36.83 - samples/sec: 282.34 - lr: 0.000094 - momentum: 0.000000
2023-10-08 19:00:03,254 epoch 5 - iter 60/121 - loss 0.24849669 - time (sec): 45.78 - samples/sec: 280.75 - lr: 0.000092 - momentum: 0.000000
2023-10-08 19:00:11,614 epoch 5 - iter 72/121 - loss 0.24690412 - time (sec): 54.14 - samples/sec: 279.20 - lr: 0.000091 - momentum: 0.000000
2023-10-08 19:00:20,478 epoch 5 - iter 84/121 - loss 0.23710460 - time (sec): 63.00 - samples/sec: 280.27 - lr: 0.000089 - momentum: 0.000000
2023-10-08 19:00:28,462 epoch 5 - iter 96/121 - loss 0.23501253 - time (sec): 70.99 - samples/sec: 279.30 - lr: 0.000087 - momentum: 0.000000
2023-10-08 19:00:36,907 epoch 5 - iter 108/121 - loss 0.23652123 - time (sec): 79.43 - samples/sec: 279.13 - lr: 0.000086 - momentum: 0.000000
2023-10-08 19:00:45,315 epoch 5 - iter 120/121 - loss 0.23187057 - time (sec): 87.84 - samples/sec: 279.33 - lr: 0.000084 - momentum: 0.000000
2023-10-08 19:00:45,995 ----------------------------------------------------------------------------------------------------
2023-10-08 19:00:45,996 EPOCH 5 done: loss 0.2315 - lr: 0.000084
2023-10-08 19:00:51,813 DEV : loss 0.22175471484661102 - f1-score (micro avg) 0.5361
2023-10-08 19:00:51,818 saving best model
2023-10-08 19:00:52,714 ----------------------------------------------------------------------------------------------------
2023-10-08 19:01:00,947 epoch 6 - iter 12/121 - loss 0.20061369 - time (sec): 8.23 - samples/sec: 273.08 - lr: 0.000082 - momentum: 0.000000
2023-10-08 19:01:09,932 epoch 6 - iter 24/121 - loss 0.19858041 - time (sec): 17.22 - samples/sec: 278.45 - lr: 0.000081 - momentum: 0.000000
2023-10-08 19:01:18,258 epoch 6 - iter 36/121 - loss 0.18757962 - time (sec): 25.54 - samples/sec: 277.85 - lr: 0.000079 - momentum: 0.000000
2023-10-08 19:01:27,129 epoch 6 - iter 48/121 - loss 0.18561409 - time (sec): 34.41 - samples/sec: 279.92 - lr: 0.000077 - momentum: 0.000000
2023-10-08 19:01:35,945 epoch 6 - iter 60/121 - loss 0.18373570 - time (sec): 43.23 - samples/sec: 281.61 - lr: 0.000076 - momentum: 0.000000
2023-10-08 19:01:45,041 epoch 6 - iter 72/121 - loss 0.18088088 - time (sec): 52.33 - samples/sec: 284.43 - lr: 0.000074 - momentum: 0.000000
2023-10-08 19:01:53,654 epoch 6 - iter 84/121 - loss 0.18589651 - time (sec): 60.94 - samples/sec: 286.67 - lr: 0.000072 - momentum: 0.000000
2023-10-08 19:02:01,974 epoch 6 - iter 96/121 - loss 0.18402194 - time (sec): 69.26 - samples/sec: 284.66 - lr: 0.000071 - momentum: 0.000000
2023-10-08 19:02:10,089 epoch 6 - iter 108/121 - loss 0.18024822 - time (sec): 77.37 - samples/sec: 283.81 - lr: 0.000069 - momentum: 0.000000
2023-10-08 19:02:19,152 epoch 6 - iter 120/121 - loss 0.17972159 - time (sec): 86.44 - samples/sec: 283.51 - lr: 0.000067 - momentum: 0.000000
2023-10-08 19:02:19,922 ----------------------------------------------------------------------------------------------------
2023-10-08 19:02:19,923 EPOCH 6 done: loss 0.1795 - lr: 0.000067
2023-10-08 19:02:25,793 DEV : loss 0.19125252962112427 - f1-score (micro avg) 0.724
2023-10-08 19:02:25,798 saving best model
2023-10-08 19:02:30,223 ----------------------------------------------------------------------------------------------------
2023-10-08 19:02:37,676 epoch 7 - iter 12/121 - loss 0.11678108 - time (sec): 7.45 - samples/sec: 262.76 - lr: 0.000066 - momentum: 0.000000
2023-10-08 19:02:46,514 epoch 7 - iter 24/121 - loss 0.13433605 - time (sec): 16.29 - samples/sec: 279.70 - lr: 0.000064 - momentum: 0.000000
2023-10-08 19:02:55,159 epoch 7 - iter 36/121 - loss 0.14325322 - time (sec): 24.93 - samples/sec: 279.98 - lr: 0.000062 - momentum: 0.000000
2023-10-08 19:03:04,166 epoch 7 - iter 48/121 - loss 0.14429225 - time (sec): 33.94 - samples/sec: 282.64 - lr: 0.000061 - momentum: 0.000000
2023-10-08 19:03:12,917 epoch 7 - iter 60/121 - loss 0.14498585 - time (sec): 42.69 - samples/sec: 282.67 - lr: 0.000059 - momentum: 0.000000
2023-10-08 19:03:21,068 epoch 7 - iter 72/121 - loss 0.14181786 - time (sec): 50.84 - samples/sec: 278.56 - lr: 0.000057 - momentum: 0.000000
2023-10-08 19:03:29,919 epoch 7 - iter 84/121 - loss 0.14277978 - time (sec): 59.69 - samples/sec: 278.60 - lr: 0.000056 - momentum: 0.000000
2023-10-08 19:03:38,995 epoch 7 - iter 96/121 - loss 0.14655386 - time (sec): 68.77 - samples/sec: 278.96 - lr: 0.000054 - momentum: 0.000000
2023-10-08 19:03:48,599 epoch 7 - iter 108/121 - loss 0.14929234 - time (sec): 78.37 - samples/sec: 279.57 - lr: 0.000052 - momentum: 0.000000
2023-10-08 19:03:58,039 epoch 7 - iter 120/121 - loss 0.14593260 - time (sec): 87.81 - samples/sec: 279.97 - lr: 0.000051 - momentum: 0.000000
2023-10-08 19:03:58,577 ----------------------------------------------------------------------------------------------------
2023-10-08 19:03:58,577 EPOCH 7 done: loss 0.1457 - lr: 0.000051
2023-10-08 19:04:04,643 DEV : loss 0.1632174551486969 - f1-score (micro avg) 0.8098
2023-10-08 19:04:04,649 saving best model
2023-10-08 19:04:09,016 ----------------------------------------------------------------------------------------------------
2023-10-08 19:04:17,277 epoch 8 - iter 12/121 - loss 0.09837444 - time (sec): 8.26 - samples/sec: 254.50 - lr: 0.000049 - momentum: 0.000000
2023-10-08 19:04:26,315 epoch 8 - iter 24/121 - loss 0.12921094 - time (sec): 17.30 - samples/sec: 268.77 - lr: 0.000047 - momentum: 0.000000
2023-10-08 19:04:35,641 epoch 8 - iter 36/121 - loss 0.12808621 - time (sec): 26.62 - samples/sec: 275.03 - lr: 0.000046 - momentum: 0.000000
2023-10-08 19:04:44,679 epoch 8 - iter 48/121 - loss 0.12197799 - time (sec): 35.66 - samples/sec: 274.22 - lr: 0.000044 - momentum: 0.000000
2023-10-08 19:04:54,047 epoch 8 - iter 60/121 - loss 0.12682628 - time (sec): 45.03 - samples/sec: 273.64 - lr: 0.000042 - momentum: 0.000000
2023-10-08 19:05:03,339 epoch 8 - iter 72/121 - loss 0.12764216 - time (sec): 54.32 - samples/sec: 273.67 - lr: 0.000041 - momentum: 0.000000
2023-10-08 19:05:12,880 epoch 8 - iter 84/121 - loss 0.13190302 - time (sec): 63.86 - samples/sec: 274.00 - lr: 0.000039 - momentum: 0.000000
2023-10-08 19:05:22,310 epoch 8 - iter 96/121 - loss 0.12705137 - time (sec): 73.29 - samples/sec: 272.20 - lr: 0.000038 - momentum: 0.000000
2023-10-08 19:05:31,307 epoch 8 - iter 108/121 - loss 0.12337688 - time (sec): 82.29 - samples/sec: 270.65 - lr: 0.000036 - momentum: 0.000000
2023-10-08 19:05:40,317 epoch 8 - iter 120/121 - loss 0.12500486 - time (sec): 91.30 - samples/sec: 269.73 - lr: 0.000034 - momentum: 0.000000
2023-10-08 19:05:40,790 ----------------------------------------------------------------------------------------------------
2023-10-08 19:05:40,790 EPOCH 8 done: loss 0.1249 - lr: 0.000034
2023-10-08 19:05:47,262 DEV : loss 0.1568206399679184 - f1-score (micro avg) 0.8055
2023-10-08 19:05:47,268 ----------------------------------------------------------------------------------------------------
2023-10-08 19:05:55,943 epoch 9 - iter 12/121 - loss 0.14458427 - time (sec): 8.67 - samples/sec: 257.36 - lr: 0.000032 - momentum: 0.000000
2023-10-08 19:06:04,681 epoch 9 - iter 24/121 - loss 0.12322166 - time (sec): 17.41 - samples/sec: 252.60 - lr: 0.000031 - momentum: 0.000000
2023-10-08 19:06:14,414 epoch 9 - iter 36/121 - loss 0.12266578 - time (sec): 27.14 - samples/sec: 254.42 - lr: 0.000029 - momentum: 0.000000
2023-10-08 19:06:23,813 epoch 9 - iter 48/121 - loss 0.12842266 - time (sec): 36.54 - samples/sec: 258.63 - lr: 0.000028 - momentum: 0.000000
2023-10-08 19:06:33,665 epoch 9 - iter 60/121 - loss 0.12385669 - time (sec): 46.39 - samples/sec: 261.24 - lr: 0.000026 - momentum: 0.000000
2023-10-08 19:06:43,548 epoch 9 - iter 72/121 - loss 0.12174150 - time (sec): 56.28 - samples/sec: 261.01 - lr: 0.000024 - momentum: 0.000000
2023-10-08 19:06:53,003 epoch 9 - iter 84/121 - loss 0.11693716 - time (sec): 65.73 - samples/sec: 261.51 - lr: 0.000023 - momentum: 0.000000
2023-10-08 19:07:02,162 epoch 9 - iter 96/121 - loss 0.11507909 - time (sec): 74.89 - samples/sec: 261.41 - lr: 0.000021 - momentum: 0.000000
2023-10-08 19:07:11,512 epoch 9 - iter 108/121 - loss 0.11452227 - time (sec): 84.24 - samples/sec: 261.45 - lr: 0.000019 - momentum: 0.000000
2023-10-08 19:07:21,019 epoch 9 - iter 120/121 - loss 0.11068693 - time (sec): 93.75 - samples/sec: 261.92 - lr: 0.000018 - momentum: 0.000000
2023-10-08 19:07:21,664 ----------------------------------------------------------------------------------------------------
2023-10-08 19:07:21,665 EPOCH 9 done: loss 0.1111 - lr: 0.000018
2023-10-08 19:07:28,328 DEV : loss 0.1590316891670227 - f1-score (micro avg) 0.8085
2023-10-08 19:07:28,334 ----------------------------------------------------------------------------------------------------
2023-10-08 19:07:37,743 epoch 10 - iter 12/121 - loss 0.09586795 - time (sec): 9.41 - samples/sec: 260.65 - lr: 0.000016 - momentum: 0.000000
2023-10-08 19:07:47,445 epoch 10 - iter 24/121 - loss 0.09619648 - time (sec): 19.11 - samples/sec: 265.15 - lr: 0.000014 - momentum: 0.000000
2023-10-08 19:07:56,745 epoch 10 - iter 36/121 - loss 0.09318596 - time (sec): 28.41 - samples/sec: 265.02 - lr: 0.000013 - momentum: 0.000000
2023-10-08 19:08:06,253 epoch 10 - iter 48/121 - loss 0.10154873 - time (sec): 37.92 - samples/sec: 266.37 - lr: 0.000011 - momentum: 0.000000
2023-10-08 19:08:14,801 epoch 10 - iter 60/121 - loss 0.10200519 - time (sec): 46.47 - samples/sec: 262.99 - lr: 0.000009 - momentum: 0.000000
2023-10-08 19:08:23,793 epoch 10 - iter 72/121 - loss 0.10058613 - time (sec): 55.46 - samples/sec: 261.66 - lr: 0.000008 - momentum: 0.000000
2023-10-08 19:08:33,173 epoch 10 - iter 84/121 - loss 0.10061131 - time (sec): 64.84 - samples/sec: 261.53 - lr: 0.000006 - momentum: 0.000000
2023-10-08 19:08:42,680 epoch 10 - iter 96/121 - loss 0.10093060 - time (sec): 74.34 - samples/sec: 262.01 - lr: 0.000004 - momentum: 0.000000
2023-10-08 19:08:52,153 epoch 10 - iter 108/121 - loss 0.09937196 - time (sec): 83.82 - samples/sec: 262.14 - lr: 0.000003 - momentum: 0.000000
2023-10-08 19:09:01,711 epoch 10 - iter 120/121 - loss 0.10257863 - time (sec): 93.38 - samples/sec: 263.21 - lr: 0.000001 - momentum: 0.000000
2023-10-08 19:09:02,402 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:02,403 EPOCH 10 done: loss 0.1026 - lr: 0.000001
2023-10-08 19:09:08,816 DEV : loss 0.1557578295469284 - f1-score (micro avg) 0.787
2023-10-08 19:09:09,673 ----------------------------------------------------------------------------------------------------
2023-10-08 19:09:09,674 Loading model from best epoch ...
2023-10-08 19:09:12,765 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 19:09:19,078
Results:
- F-score (micro) 0.7643
- F-score (macro) 0.4572
- Accuracy 0.6523
By class:
              precision    recall  f1-score   support

        pers     0.7391    0.8561    0.7933       139
       scope     0.7740    0.8760    0.8218       129
        work     0.6548    0.6875    0.6707        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7340    0.7972    0.7643       360
   macro avg     0.4336    0.4839    0.4572       360
weighted avg     0.7082    0.7972    0.7499       360
2023-10-08 19:09:19,078 ----------------------------------------------------------------------------------------------------
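
The averaged rows of the per-class table above can be cross-checked from the per-class rows. The following is a minimal sketch, assuming the usual classification-report conventions, which the log itself does not state: micro F1 is the harmonic mean of the micro precision and recall, the macro row is the unweighted mean of the per-class scores, and the weighted row is the support-weighted mean. All function and variable names are illustrative and are not part of Flair.

```python
# Per-class rows copied from the log: (precision, recall, f1, support).
per_class = {
    "pers": (0.7391, 0.8561, 0.7933, 139),
    "scope": (0.7740, 0.8760, 0.8218, 129),
    "work": (0.6548, 0.6875, 0.6707, 80),
    "loc": (0.0000, 0.0000, 0.0000, 9),
    "date": (0.0000, 0.0000, 0.0000, 3),
}


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_avg(rows, column):
    """Unweighted mean of one score column over all classes."""
    return sum(row[column] for row in rows.values()) / len(rows)


def weighted_avg(rows, column):
    """Mean of one score column, weighted by each class's support."""
    total = sum(row[3] for row in rows.values())
    return sum(row[column] * row[3] for row in rows.values()) / total


total_support = sum(row[3] for row in per_class.values())
micro_f1 = harmonic_f1(0.7340, 0.7972)  # micro precision and recall from the log
macro_f1 = macro_avg(per_class, 2)      # column 2 holds the per-class F1
weighted_f1 = weighted_avg(per_class, 2)

print(f"support={total_support}")
print(f"micro F1={micro_f1:.4f} macro F1={macro_f1:.4f} weighted F1={weighted_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values may differ from the logged ones in the last digit. The two classes with zero scores (loc and date, 12 of the 360 test entities) pull the macro F1 well below the micro F1, which is why the two figures diverge so much in this run.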