2023-10-13 01:26:15,069 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,072 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 01:26:15,072 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,072 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-13 01:26:15,072 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,072 Train: 14465 sentences
2023-10-13 01:26:15,072 (train_with_dev=False, train_with_test=False)
2023-10-13 01:26:15,072 ----------------------------------------------------------------------------------------------------
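Note: the architecture and corpus logged above could be assembled in Flair roughly as follows. This is a minimal sketch, not the original training script; the hmByT5 model id is inferred from the training base path further below, and the NER_HIPE_2022 constructor arguments are assumptions that may differ between Flair versions.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2022 "letemps" French corpus (14465 train / 1392 dev / 2432 test sentences)
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")  # 13 tags, see the prediction dictionary at the end of this log

# ByT5 encoder as token embeddings: last layer only, first-subtoken pooling, fine-tuned
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # inferred from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Linear classification head (1472 -> 13), no CRF and no RNN, matching the printout above
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)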
2023-10-13 01:26:15,072 Training Params:
2023-10-13 01:26:15,072 - learning_rate: "0.00016"
2023-10-13 01:26:15,072 - mini_batch_size: "8"
2023-10-13 01:26:15,073 - max_epochs: "10"
2023-10-13 01:26:15,073 - shuffle: "True"
2023-10-13 01:26:15,073 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,073 Plugins:
2023-10-13 01:26:15,073 - TensorboardLogger
2023-10-13 01:26:15,073 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 01:26:15,073 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,073 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 01:26:15,073 - metric: "('micro avg', 'f1-score')"
2023-10-13 01:26:15,073 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,073 Computation:
2023-10-13 01:26:15,073 - compute on device: cuda:0
2023-10-13 01:26:15,073 - embedding storage: none
2023-10-13 01:26:15,073 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,073 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-13 01:26:15,074 ----------------------------------------------------------------------------------------------------
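Note: continuing the sketch above, a fine-tuning call matching the logged Training Params and Plugins would look roughly like this. It assumes a recent Flair trainer API; the TensorboardLogger import path, its arguments, and the fine_tune defaults may vary by version.

from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger  # assumed import path

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    plugins=[TensorboardLogger()],  # constructor arguments omitted; a linear LR schedule with 10% warmup is the fine_tune default
)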
2023-10-13 01:26:15,074 ----------------------------------------------------------------------------------------------------
2023-10-13 01:26:15,074 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 01:27:50,591 epoch 1 - iter 180/1809 - loss 2.57027178 - time (sec): 95.52 - samples/sec: 402.43 - lr: 0.000016 - momentum: 0.000000
2023-10-13 01:29:25,564 epoch 1 - iter 360/1809 - loss 2.33739383 - time (sec): 190.49 - samples/sec: 398.49 - lr: 0.000032 - momentum: 0.000000
2023-10-13 01:30:59,649 epoch 1 - iter 540/1809 - loss 1.98337869 - time (sec): 284.57 - samples/sec: 396.72 - lr: 0.000048 - momentum: 0.000000
2023-10-13 01:32:35,319 epoch 1 - iter 720/1809 - loss 1.61896865 - time (sec): 380.24 - samples/sec: 398.89 - lr: 0.000064 - momentum: 0.000000
2023-10-13 01:34:08,395 epoch 1 - iter 900/1809 - loss 1.35128031 - time (sec): 473.32 - samples/sec: 400.08 - lr: 0.000080 - momentum: 0.000000
2023-10-13 01:35:39,939 epoch 1 - iter 1080/1809 - loss 1.16606342 - time (sec): 564.86 - samples/sec: 400.35 - lr: 0.000095 - momentum: 0.000000
2023-10-13 01:37:11,149 epoch 1 - iter 1260/1809 - loss 1.02972052 - time (sec): 656.07 - samples/sec: 400.53 - lr: 0.000111 - momentum: 0.000000
2023-10-13 01:38:42,954 epoch 1 - iter 1440/1809 - loss 0.92088707 - time (sec): 747.88 - samples/sec: 402.22 - lr: 0.000127 - momentum: 0.000000
2023-10-13 01:40:18,744 epoch 1 - iter 1620/1809 - loss 0.83665554 - time (sec): 843.67 - samples/sec: 401.67 - lr: 0.000143 - momentum: 0.000000
2023-10-13 01:41:56,057 epoch 1 - iter 1800/1809 - loss 0.76361147 - time (sec): 940.98 - samples/sec: 401.57 - lr: 0.000159 - momentum: 0.000000
2023-10-13 01:42:00,723 ----------------------------------------------------------------------------------------------------
2023-10-13 01:42:00,723 EPOCH 1 done: loss 0.7603 - lr: 0.000159
2023-10-13 01:42:37,965 DEV : loss 0.14501185715198517 - f1-score (micro avg) 0.4122
2023-10-13 01:42:38,027 saving best model
2023-10-13 01:42:38,900 ----------------------------------------------------------------------------------------------------
2023-10-13 01:44:12,141 epoch 2 - iter 180/1809 - loss 0.11022017 - time (sec): 93.24 - samples/sec: 415.46 - lr: 0.000158 - momentum: 0.000000
2023-10-13 01:45:45,611 epoch 2 - iter 360/1809 - loss 0.11221991 - time (sec): 186.71 - samples/sec: 414.87 - lr: 0.000156 - momentum: 0.000000
2023-10-13 01:47:16,997 epoch 2 - iter 540/1809 - loss 0.10902795 - time (sec): 278.09 - samples/sec: 414.67 - lr: 0.000155 - momentum: 0.000000
2023-10-13 01:48:48,299 epoch 2 - iter 720/1809 - loss 0.10834577 - time (sec): 369.40 - samples/sec: 411.87 - lr: 0.000153 - momentum: 0.000000
2023-10-13 01:50:21,166 epoch 2 - iter 900/1809 - loss 0.10540217 - time (sec): 462.26 - samples/sec: 407.82 - lr: 0.000151 - momentum: 0.000000
2023-10-13 01:51:51,954 epoch 2 - iter 1080/1809 - loss 0.10504549 - time (sec): 553.05 - samples/sec: 410.17 - lr: 0.000149 - momentum: 0.000000
2023-10-13 01:53:21,268 epoch 2 - iter 1260/1809 - loss 0.10280905 - time (sec): 642.37 - samples/sec: 411.75 - lr: 0.000148 - momentum: 0.000000
2023-10-13 01:54:50,585 epoch 2 - iter 1440/1809 - loss 0.10065484 - time (sec): 731.68 - samples/sec: 413.77 - lr: 0.000146 - momentum: 0.000000
2023-10-13 01:56:20,686 epoch 2 - iter 1620/1809 - loss 0.09800215 - time (sec): 821.78 - samples/sec: 415.31 - lr: 0.000144 - momentum: 0.000000
2023-10-13 01:57:49,286 epoch 2 - iter 1800/1809 - loss 0.09732421 - time (sec): 910.38 - samples/sec: 415.28 - lr: 0.000142 - momentum: 0.000000
2023-10-13 01:57:53,403 ----------------------------------------------------------------------------------------------------
2023-10-13 01:57:53,404 EPOCH 2 done: loss 0.0971 - lr: 0.000142
2023-10-13 01:58:32,248 DEV : loss 0.10631529986858368 - f1-score (micro avg) 0.618
2023-10-13 01:58:32,304 saving best model
2023-10-13 01:58:34,913 ----------------------------------------------------------------------------------------------------
2023-10-13 02:00:04,640 epoch 3 - iter 180/1809 - loss 0.06167102 - time (sec): 89.72 - samples/sec: 425.26 - lr: 0.000140 - momentum: 0.000000
2023-10-13 02:01:36,034 epoch 3 - iter 360/1809 - loss 0.06052679 - time (sec): 181.12 - samples/sec: 422.80 - lr: 0.000139 - momentum: 0.000000
2023-10-13 02:03:04,768 epoch 3 - iter 540/1809 - loss 0.06164459 - time (sec): 269.85 - samples/sec: 419.93 - lr: 0.000137 - momentum: 0.000000
2023-10-13 02:04:33,141 epoch 3 - iter 720/1809 - loss 0.06028257 - time (sec): 358.22 - samples/sec: 420.01 - lr: 0.000135 - momentum: 0.000000
2023-10-13 02:06:04,511 epoch 3 - iter 900/1809 - loss 0.06093924 - time (sec): 449.59 - samples/sec: 418.97 - lr: 0.000133 - momentum: 0.000000
2023-10-13 02:07:33,194 epoch 3 - iter 1080/1809 - loss 0.06123599 - time (sec): 538.28 - samples/sec: 420.59 - lr: 0.000132 - momentum: 0.000000
2023-10-13 02:09:05,494 epoch 3 - iter 1260/1809 - loss 0.06096843 - time (sec): 630.58 - samples/sec: 420.26 - lr: 0.000130 - momentum: 0.000000
2023-10-13 02:10:36,079 epoch 3 - iter 1440/1809 - loss 0.06042058 - time (sec): 721.16 - samples/sec: 419.13 - lr: 0.000128 - momentum: 0.000000
2023-10-13 02:12:05,652 epoch 3 - iter 1620/1809 - loss 0.06085687 - time (sec): 810.73 - samples/sec: 419.46 - lr: 0.000126 - momentum: 0.000000
2023-10-13 02:13:34,541 epoch 3 - iter 1800/1809 - loss 0.06010337 - time (sec): 899.62 - samples/sec: 420.31 - lr: 0.000125 - momentum: 0.000000
2023-10-13 02:13:38,557 ----------------------------------------------------------------------------------------------------
2023-10-13 02:13:38,557 EPOCH 3 done: loss 0.0600 - lr: 0.000125
2023-10-13 02:14:17,081 DEV : loss 0.1486276537179947 - f1-score (micro avg) 0.6279
2023-10-13 02:14:17,138 saving best model
2023-10-13 02:14:19,719 ----------------------------------------------------------------------------------------------------
2023-10-13 02:15:49,206 epoch 4 - iter 180/1809 - loss 0.04438082 - time (sec): 89.48 - samples/sec: 412.12 - lr: 0.000123 - momentum: 0.000000
2023-10-13 02:17:20,752 epoch 4 - iter 360/1809 - loss 0.04634535 - time (sec): 181.03 - samples/sec: 421.41 - lr: 0.000121 - momentum: 0.000000
2023-10-13 02:18:53,652 epoch 4 - iter 540/1809 - loss 0.04429804 - time (sec): 273.93 - samples/sec: 414.29 - lr: 0.000119 - momentum: 0.000000
2023-10-13 02:20:26,564 epoch 4 - iter 720/1809 - loss 0.04282892 - time (sec): 366.84 - samples/sec: 410.30 - lr: 0.000117 - momentum: 0.000000
2023-10-13 02:21:58,754 epoch 4 - iter 900/1809 - loss 0.04357623 - time (sec): 459.03 - samples/sec: 408.03 - lr: 0.000116 - momentum: 0.000000
2023-10-13 02:23:32,299 epoch 4 - iter 1080/1809 - loss 0.04492686 - time (sec): 552.57 - samples/sec: 407.98 - lr: 0.000114 - momentum: 0.000000
2023-10-13 02:25:06,137 epoch 4 - iter 1260/1809 - loss 0.04500505 - time (sec): 646.41 - samples/sec: 406.96 - lr: 0.000112 - momentum: 0.000000
2023-10-13 02:26:42,107 epoch 4 - iter 1440/1809 - loss 0.04465500 - time (sec): 742.38 - samples/sec: 405.46 - lr: 0.000110 - momentum: 0.000000
2023-10-13 02:28:20,025 epoch 4 - iter 1620/1809 - loss 0.04378505 - time (sec): 840.30 - samples/sec: 405.00 - lr: 0.000109 - momentum: 0.000000
2023-10-13 02:29:56,157 epoch 4 - iter 1800/1809 - loss 0.04381280 - time (sec): 936.43 - samples/sec: 403.84 - lr: 0.000107 - momentum: 0.000000
2023-10-13 02:30:00,550 ----------------------------------------------------------------------------------------------------
2023-10-13 02:30:00,551 EPOCH 4 done: loss 0.0440 - lr: 0.000107
2023-10-13 02:30:42,177 DEV : loss 0.1783849447965622 - f1-score (micro avg) 0.5908
2023-10-13 02:30:42,244 ----------------------------------------------------------------------------------------------------
2023-10-13 02:32:18,971 epoch 5 - iter 180/1809 - loss 0.02710856 - time (sec): 96.72 - samples/sec: 396.17 - lr: 0.000105 - momentum: 0.000000
2023-10-13 02:33:56,021 epoch 5 - iter 360/1809 - loss 0.02681704 - time (sec): 193.77 - samples/sec: 395.65 - lr: 0.000103 - momentum: 0.000000
2023-10-13 02:35:30,935 epoch 5 - iter 540/1809 - loss 0.02671099 - time (sec): 288.69 - samples/sec: 392.24 - lr: 0.000101 - momentum: 0.000000
2023-10-13 02:37:05,596 epoch 5 - iter 720/1809 - loss 0.02957950 - time (sec): 383.35 - samples/sec: 390.35 - lr: 0.000100 - momentum: 0.000000
2023-10-13 02:38:39,902 epoch 5 - iter 900/1809 - loss 0.03107445 - time (sec): 477.66 - samples/sec: 391.89 - lr: 0.000098 - momentum: 0.000000
2023-10-13 02:40:15,203 epoch 5 - iter 1080/1809 - loss 0.03053907 - time (sec): 572.96 - samples/sec: 393.23 - lr: 0.000096 - momentum: 0.000000
2023-10-13 02:41:51,148 epoch 5 - iter 1260/1809 - loss 0.03099549 - time (sec): 668.90 - samples/sec: 392.67 - lr: 0.000094 - momentum: 0.000000
2023-10-13 02:43:24,310 epoch 5 - iter 1440/1809 - loss 0.03239734 - time (sec): 762.06 - samples/sec: 393.07 - lr: 0.000093 - momentum: 0.000000
2023-10-13 02:45:01,678 epoch 5 - iter 1620/1809 - loss 0.03173686 - time (sec): 859.43 - samples/sec: 395.54 - lr: 0.000091 - momentum: 0.000000
2023-10-13 02:46:39,758 epoch 5 - iter 1800/1809 - loss 0.03205991 - time (sec): 957.51 - samples/sec: 394.91 - lr: 0.000089 - momentum: 0.000000
2023-10-13 02:46:44,304 ----------------------------------------------------------------------------------------------------
2023-10-13 02:46:44,305 EPOCH 5 done: loss 0.0320 - lr: 0.000089
2023-10-13 02:47:26,896 DEV : loss 0.2254001647233963 - f1-score (micro avg) 0.6268
2023-10-13 02:47:26,979 ----------------------------------------------------------------------------------------------------
2023-10-13 02:49:02,907 epoch 6 - iter 180/1809 - loss 0.02309495 - time (sec): 95.92 - samples/sec: 392.21 - lr: 0.000087 - momentum: 0.000000
2023-10-13 02:50:42,922 epoch 6 - iter 360/1809 - loss 0.02270880 - time (sec): 195.94 - samples/sec: 386.67 - lr: 0.000085 - momentum: 0.000000
2023-10-13 02:52:19,953 epoch 6 - iter 540/1809 - loss 0.02278981 - time (sec): 292.97 - samples/sec: 386.21 - lr: 0.000084 - momentum: 0.000000
2023-10-13 02:53:57,447 epoch 6 - iter 720/1809 - loss 0.02331295 - time (sec): 390.47 - samples/sec: 387.70 - lr: 0.000082 - momentum: 0.000000
2023-10-13 02:55:33,309 epoch 6 - iter 900/1809 - loss 0.02407055 - time (sec): 486.33 - samples/sec: 389.86 - lr: 0.000080 - momentum: 0.000000
2023-10-13 02:57:09,646 epoch 6 - iter 1080/1809 - loss 0.02408103 - time (sec): 582.66 - samples/sec: 389.96 - lr: 0.000078 - momentum: 0.000000
2023-10-13 02:58:45,520 epoch 6 - iter 1260/1809 - loss 0.02505139 - time (sec): 678.54 - samples/sec: 390.43 - lr: 0.000077 - momentum: 0.000000
2023-10-13 03:00:20,100 epoch 6 - iter 1440/1809 - loss 0.02498905 - time (sec): 773.12 - samples/sec: 389.95 - lr: 0.000075 - momentum: 0.000000
2023-10-13 03:01:52,543 epoch 6 - iter 1620/1809 - loss 0.02440205 - time (sec): 865.56 - samples/sec: 391.85 - lr: 0.000073 - momentum: 0.000000
2023-10-13 03:03:26,329 epoch 6 - iter 1800/1809 - loss 0.02416447 - time (sec): 959.35 - samples/sec: 394.14 - lr: 0.000071 - momentum: 0.000000
2023-10-13 03:03:30,703 ----------------------------------------------------------------------------------------------------
2023-10-13 03:03:30,704 EPOCH 6 done: loss 0.0241 - lr: 0.000071
2023-10-13 03:04:12,643 DEV : loss 0.26813653111457825 - f1-score (micro avg) 0.6519
2023-10-13 03:04:12,704 saving best model
2023-10-13 03:04:15,437 ----------------------------------------------------------------------------------------------------
2023-10-13 03:05:49,546 epoch 7 - iter 180/1809 - loss 0.01558310 - time (sec): 94.11 - samples/sec: 392.76 - lr: 0.000069 - momentum: 0.000000
2023-10-13 03:07:23,547 epoch 7 - iter 360/1809 - loss 0.01547849 - time (sec): 188.11 - samples/sec: 402.58 - lr: 0.000068 - momentum: 0.000000
2023-10-13 03:08:56,013 epoch 7 - iter 540/1809 - loss 0.01669214 - time (sec): 280.57 - samples/sec: 404.44 - lr: 0.000066 - momentum: 0.000000
2023-10-13 03:10:28,801 epoch 7 - iter 720/1809 - loss 0.01856464 - time (sec): 373.36 - samples/sec: 404.84 - lr: 0.000064 - momentum: 0.000000
2023-10-13 03:12:00,985 epoch 7 - iter 900/1809 - loss 0.01851849 - time (sec): 465.54 - samples/sec: 405.68 - lr: 0.000062 - momentum: 0.000000
2023-10-13 03:13:32,908 epoch 7 - iter 1080/1809 - loss 0.01798567 - time (sec): 557.47 - samples/sec: 405.34 - lr: 0.000061 - momentum: 0.000000
2023-10-13 03:15:06,435 epoch 7 - iter 1260/1809 - loss 0.01767135 - time (sec): 650.99 - samples/sec: 405.31 - lr: 0.000059 - momentum: 0.000000
2023-10-13 03:16:41,207 epoch 7 - iter 1440/1809 - loss 0.01898520 - time (sec): 745.77 - samples/sec: 404.12 - lr: 0.000057 - momentum: 0.000000
2023-10-13 03:18:16,201 epoch 7 - iter 1620/1809 - loss 0.01916789 - time (sec): 840.76 - samples/sec: 404.08 - lr: 0.000055 - momentum: 0.000000
2023-10-13 03:19:50,732 epoch 7 - iter 1800/1809 - loss 0.01895459 - time (sec): 935.29 - samples/sec: 404.42 - lr: 0.000053 - momentum: 0.000000
2023-10-13 03:19:55,054 ----------------------------------------------------------------------------------------------------
2023-10-13 03:19:55,054 EPOCH 7 done: loss 0.0189 - lr: 0.000053
2023-10-13 03:20:35,004 DEV : loss 0.29598313570022583 - f1-score (micro avg) 0.6553
2023-10-13 03:20:35,066 saving best model
2023-10-13 03:20:37,700 ----------------------------------------------------------------------------------------------------
2023-10-13 03:22:10,318 epoch 8 - iter 180/1809 - loss 0.01284778 - time (sec): 92.61 - samples/sec: 405.30 - lr: 0.000052 - momentum: 0.000000
2023-10-13 03:23:42,134 epoch 8 - iter 360/1809 - loss 0.01152205 - time (sec): 184.43 - samples/sec: 411.62 - lr: 0.000050 - momentum: 0.000000
2023-10-13 03:25:17,893 epoch 8 - iter 540/1809 - loss 0.01144334 - time (sec): 280.19 - samples/sec: 406.68 - lr: 0.000048 - momentum: 0.000000
2023-10-13 03:26:55,544 epoch 8 - iter 720/1809 - loss 0.01247695 - time (sec): 377.84 - samples/sec: 404.95 - lr: 0.000046 - momentum: 0.000000
2023-10-13 03:28:29,861 epoch 8 - iter 900/1809 - loss 0.01242235 - time (sec): 472.16 - samples/sec: 405.85 - lr: 0.000044 - momentum: 0.000000
2023-10-13 03:30:00,572 epoch 8 - iter 1080/1809 - loss 0.01249440 - time (sec): 562.87 - samples/sec: 404.47 - lr: 0.000043 - momentum: 0.000000
2023-10-13 03:31:32,676 epoch 8 - iter 1260/1809 - loss 0.01284750 - time (sec): 654.97 - samples/sec: 404.21 - lr: 0.000041 - momentum: 0.000000
2023-10-13 03:33:05,856 epoch 8 - iter 1440/1809 - loss 0.01286960 - time (sec): 748.15 - samples/sec: 405.03 - lr: 0.000039 - momentum: 0.000000
2023-10-13 03:34:39,321 epoch 8 - iter 1620/1809 - loss 0.01296803 - time (sec): 841.62 - samples/sec: 405.50 - lr: 0.000037 - momentum: 0.000000
2023-10-13 03:36:14,046 epoch 8 - iter 1800/1809 - loss 0.01287811 - time (sec): 936.34 - samples/sec: 404.13 - lr: 0.000036 - momentum: 0.000000
2023-10-13 03:36:18,143 ----------------------------------------------------------------------------------------------------
2023-10-13 03:36:18,143 EPOCH 8 done: loss 0.0128 - lr: 0.000036
2023-10-13 03:36:57,138 DEV : loss 0.33492255210876465 - f1-score (micro avg) 0.647
2023-10-13 03:36:57,201 ----------------------------------------------------------------------------------------------------
2023-10-13 03:38:34,200 epoch 9 - iter 180/1809 - loss 0.00784411 - time (sec): 97.00 - samples/sec: 383.03 - lr: 0.000034 - momentum: 0.000000
2023-10-13 03:40:11,602 epoch 9 - iter 360/1809 - loss 0.01133300 - time (sec): 194.40 - samples/sec: 384.49 - lr: 0.000032 - momentum: 0.000000
2023-10-13 03:41:47,232 epoch 9 - iter 540/1809 - loss 0.01015941 - time (sec): 290.03 - samples/sec: 387.16 - lr: 0.000030 - momentum: 0.000000
2023-10-13 03:43:23,081 epoch 9 - iter 720/1809 - loss 0.01025025 - time (sec): 385.88 - samples/sec: 394.75 - lr: 0.000028 - momentum: 0.000000
2023-10-13 03:44:57,677 epoch 9 - iter 900/1809 - loss 0.01070525 - time (sec): 480.47 - samples/sec: 394.93 - lr: 0.000027 - momentum: 0.000000
2023-10-13 03:46:31,742 epoch 9 - iter 1080/1809 - loss 0.01028318 - time (sec): 574.54 - samples/sec: 396.01 - lr: 0.000025 - momentum: 0.000000
2023-10-13 03:48:04,906 epoch 9 - iter 1260/1809 - loss 0.01021383 - time (sec): 667.70 - samples/sec: 396.05 - lr: 0.000023 - momentum: 0.000000
2023-10-13 03:49:39,964 epoch 9 - iter 1440/1809 - loss 0.01089912 - time (sec): 762.76 - samples/sec: 396.96 - lr: 0.000021 - momentum: 0.000000
2023-10-13 03:51:13,537 epoch 9 - iter 1620/1809 - loss 0.01101134 - time (sec): 856.33 - samples/sec: 398.65 - lr: 0.000020 - momentum: 0.000000
2023-10-13 03:52:47,726 epoch 9 - iter 1800/1809 - loss 0.01064439 - time (sec): 950.52 - samples/sec: 398.11 - lr: 0.000018 - momentum: 0.000000
2023-10-13 03:52:51,949 ----------------------------------------------------------------------------------------------------
2023-10-13 03:52:51,950 EPOCH 9 done: loss 0.0106 - lr: 0.000018
2023-10-13 03:53:31,082 DEV : loss 0.3527080714702606 - f1-score (micro avg) 0.6497
2023-10-13 03:53:31,150 ----------------------------------------------------------------------------------------------------
2023-10-13 03:55:10,271 epoch 10 - iter 180/1809 - loss 0.01074091 - time (sec): 99.12 - samples/sec: 381.35 - lr: 0.000016 - momentum: 0.000000
2023-10-13 03:56:50,888 epoch 10 - iter 360/1809 - loss 0.00827756 - time (sec): 199.74 - samples/sec: 375.73 - lr: 0.000014 - momentum: 0.000000
2023-10-13 03:58:30,839 epoch 10 - iter 540/1809 - loss 0.00823781 - time (sec): 299.69 - samples/sec: 376.56 - lr: 0.000012 - momentum: 0.000000
2023-10-13 04:00:10,229 epoch 10 - iter 720/1809 - loss 0.00811760 - time (sec): 399.08 - samples/sec: 375.88 - lr: 0.000011 - momentum: 0.000000
2023-10-13 04:01:47,773 epoch 10 - iter 900/1809 - loss 0.00799705 - time (sec): 496.62 - samples/sec: 378.88 - lr: 0.000009 - momentum: 0.000000
2023-10-13 04:03:23,198 epoch 10 - iter 1080/1809 - loss 0.00755984 - time (sec): 592.05 - samples/sec: 382.04 - lr: 0.000007 - momentum: 0.000000
2023-10-13 04:04:58,210 epoch 10 - iter 1260/1809 - loss 0.00766798 - time (sec): 687.06 - samples/sec: 384.62 - lr: 0.000005 - momentum: 0.000000
2023-10-13 04:06:32,716 epoch 10 - iter 1440/1809 - loss 0.00809054 - time (sec): 781.56 - samples/sec: 387.08 - lr: 0.000004 - momentum: 0.000000
2023-10-13 04:08:08,141 epoch 10 - iter 1620/1809 - loss 0.00862815 - time (sec): 876.99 - samples/sec: 388.22 - lr: 0.000002 - momentum: 0.000000
2023-10-13 04:09:43,867 epoch 10 - iter 1800/1809 - loss 0.00839058 - time (sec): 972.71 - samples/sec: 389.06 - lr: 0.000000 - momentum: 0.000000
2023-10-13 04:09:48,012 ----------------------------------------------------------------------------------------------------
2023-10-13 04:09:48,012 EPOCH 10 done: loss 0.0084 - lr: 0.000000
2023-10-13 04:10:26,982 DEV : loss 0.3519783318042755 - f1-score (micro avg) 0.6454
2023-10-13 04:10:27,904 ----------------------------------------------------------------------------------------------------
2023-10-13 04:10:27,906 Loading model from best epoch ...
2023-10-13 04:10:32,462 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-13 04:11:31,927
Results:
- F-score (micro) 0.6338
- F-score (macro) 0.4822
- Accuracy 0.4769
By class:
              precision    recall  f1-score   support

         loc     0.6496    0.7530    0.6975       591
        pers     0.5405    0.7479    0.6275       357
         org     0.1304    0.1139    0.1216        79

   micro avg     0.5777    0.7020    0.6338      1027
   macro avg     0.4402    0.5383    0.4822      1027
weighted avg     0.5718    0.7020    0.6289      1027
2023-10-13 04:11:31,927 ----------------------------------------------------------------------------------------------------
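Note: the best checkpoint selected above can be loaded for inference with Flair. A minimal usage sketch (the example sentence is illustrative only):

from flair.data import Sentence
from flair.models import SequenceTagger

# load the checkpoint written to the training base path
tagger = SequenceTagger.load(
    "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon.")  # illustrative French sentence
tagger.predict(sentence)
for label in sentence.get_labels("ner"):
    print(label)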