Upload folder using huggingface_hub

Files changed:
- best-model.pt +3 -0
- dev.tsv +0 -0
- final-model.pt +3 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697252212.c8b2203b18a8.2923.10 +3 -0
- test.tsv +0 -0
- training.log +262 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:23b04e2ab00e6045f6718fbe5246cfc010da5937b7353e76f4e27a1a6aba80dd
+size 870793839
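The `.pt` checkpoints in this commit are tracked with Git LFS, so the repository itself only stores a small pointer file like the one above; the ~870 MB weights live in LFS storage. As a minimal illustration (plain Python, not part of this repository), such a pointer can be split into its fields:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content of best-model.pt, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:23b04e2ab00e6045f6718fbe5246cfc010da5937b7353e76f4e27a1a6aba80dd
size 870793839"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # file size in bytes of the real checkpoint
print(info["oid"])   # hash algorithm and digest of the stored blob
```

The `size` and `oid` fields are what Git LFS uses to fetch and verify the actual file on checkout.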
dev.tsv
ADDED
The diff for this file is too large to render.
See raw diff
final-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d69c2a195260134f00becd510b8463914c52277115fede488417ad7be735543
+size 870793956
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 03:14:03 0.0001 0.6115 0.1287 0.5271 0.7346 0.6138 0.4515
+2 03:31:21 0.0001 0.0899 0.1199 0.5292 0.7666 0.6262 0.4621
+3 03:50:10 0.0001 0.0640 0.1678 0.5869 0.7185 0.6461 0.4846
+4 04:08:52 0.0001 0.0467 0.2165 0.5360 0.7838 0.6366 0.4777
+5 04:27:17 0.0001 0.0326 0.2349 0.5756 0.7231 0.6410 0.4810
+6 04:45:30 0.0001 0.0227 0.2885 0.5774 0.7471 0.6514 0.4921
+7 05:03:46 0.0001 0.0148 0.3005 0.5816 0.7300 0.6474 0.4863
+8 05:21:56 0.0000 0.0097 0.3371 0.5707 0.7529 0.6492 0.4899
+9 05:39:46 0.0000 0.0066 0.3554 0.5692 0.7574 0.6500 0.4911
+10 05:57:55 0.0000 0.0045 0.3686 0.5684 0.7700 0.6540 0.4959
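loss.tsv holds one row per epoch; the best dev F1 in the table (0.6540, epoch 10) is the row that corresponds to the final "saving best model" in the training log. As a sketch, such a file can be read with Python's `csv` module; the rows below are copied from the table above, trimmed to three epochs for brevity:

```python
import csv
import io

# Three representative rows of loss.tsv (tab-separated in the real file).
loss_tsv = (
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\tDEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY\n"
    "1\t03:14:03\t0.0001\t0.6115\t0.1287\t0.5271\t0.7346\t0.6138\t0.4515\n"
    "6\t04:45:30\t0.0001\t0.0227\t0.2885\t0.5774\t0.7471\t0.6514\t0.4921\n"
    "10\t05:57:55\t0.0000\t0.0045\t0.3686\t0.5684\t0.7700\t0.6540\t0.4959\n"
)

rows = list(csv.DictReader(io.StringIO(loss_tsv), delimiter="\t"))
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # the epoch kept as best-model.pt
```

Note that train loss falls monotonically while dev loss rises after epoch 2; model selection here goes by dev F1, not dev loss.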
runs/events.out.tfevents.1697252212.c8b2203b18a8.2923.10
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c03fb63bc260e5c68d94fef959e1cc9efaac9ccf749366ede5479101d17afbb8
+size 2030580
test.tsv
ADDED
The diff for this file is too large to render.
See raw diff
training.log
ADDED
@@ -0,0 +1,262 @@
+2023-10-14 02:56:52,966 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,968 Model: "SequenceTagger(
+  (embeddings): ByT5Embeddings(
+    (model): T5EncoderModel(
+      (shared): Embedding(384, 1472)
+      (encoder): T5Stack(
+        (embed_tokens): Embedding(384, 1472)
+        (block): ModuleList(
+          (0): T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                  (relative_attention_bias): Embedding(32, 6)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+          (1-11): 11 x T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+        )
+        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=1472, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-14 02:56:52,968 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,969 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+2023-10-14 02:56:52,969 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,969 Train:  14465 sentences
+2023-10-14 02:56:52,969 (train_with_dev=False, train_with_test=False)
+2023-10-14 02:56:52,969 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,969 Training Params:
+2023-10-14 02:56:52,969  - learning_rate: "0.00015"
+2023-10-14 02:56:52,969  - mini_batch_size: "4"
+2023-10-14 02:56:52,969  - max_epochs: "10"
+2023-10-14 02:56:52,969  - shuffle: "True"
+2023-10-14 02:56:52,969 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,970 Plugins:
+2023-10-14 02:56:52,970  - TensorboardLogger
+2023-10-14 02:56:52,970  - LinearScheduler | warmup_fraction: '0.1'
+2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,970 Final evaluation on model from best epoch (best-model.pt)
+2023-10-14 02:56:52,970  - metric: "('micro avg', 'f1-score')"
+2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
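The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the learning-rate trajectory visible in the per-iteration lines of this log: the lr ramps up linearly over the first 10% of all steps (peaking at the configured 0.00015 at the end of epoch 1) and then decays linearly to zero. Flair's actual implementation may differ in small details; the following sketch only reproduces that shape, with the step counts taken from the log (3617 batches per epoch, 10 epochs):

```python
def linear_schedule(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (sketch, not Flair's code)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # linear ramp up
    # linear decay from peak down to zero over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 3617 * 10  # iterations per epoch x epochs
peak = 0.00015
print(linear_schedule(361, total, peak, 0.1))    # ~0.000015, as logged at iter 361/3617
print(linear_schedule(3617, total, peak, 0.1))   # ~0.000150, end of warmup
print(linear_schedule(total, total, peak, 0.1))  # 0.0 at the final step
```

This matches the logged values: lr 0.000015 at iter 361 of epoch 1, 0.000150 when epoch 1 finishes, and 0.000000 at the last iteration of epoch 10.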
+2023-10-14 02:56:52,970 Computation:
+2023-10-14 02:56:52,970 - compute on device: cuda:0
+2023-10-14 02:56:52,970 - embedding storage: none
+2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,970 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
+2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
+2023-10-14 02:56:52,971 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-14 02:58:32,206 epoch 1 - iter 361/3617 - loss 2.48047885 - time (sec): 99.23 - samples/sec: 380.07 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 03:00:10,564 epoch 1 - iter 722/3617 - loss 2.10182133 - time (sec): 197.59 - samples/sec: 376.99 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 03:01:51,336 epoch 1 - iter 1083/3617 - loss 1.65225929 - time (sec): 298.36 - samples/sec: 378.51 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 03:03:32,329 epoch 1 - iter 1444/3617 - loss 1.30928949 - time (sec): 399.36 - samples/sec: 378.96 - lr: 0.000060 - momentum: 0.000000
+2023-10-14 03:05:11,749 epoch 1 - iter 1805/3617 - loss 1.08702017 - time (sec): 498.78 - samples/sec: 378.85 - lr: 0.000075 - momentum: 0.000000
+2023-10-14 03:06:53,731 epoch 1 - iter 2166/3617 - loss 0.93505989 - time (sec): 600.76 - samples/sec: 377.19 - lr: 0.000090 - momentum: 0.000000
+2023-10-14 03:08:31,230 epoch 1 - iter 2527/3617 - loss 0.82602626 - time (sec): 698.26 - samples/sec: 376.94 - lr: 0.000105 - momentum: 0.000000
+2023-10-14 03:10:07,444 epoch 1 - iter 2888/3617 - loss 0.74069780 - time (sec): 794.47 - samples/sec: 379.05 - lr: 0.000120 - momentum: 0.000000
+2023-10-14 03:11:45,292 epoch 1 - iter 3249/3617 - loss 0.66755548 - time (sec): 892.32 - samples/sec: 381.80 - lr: 0.000135 - momentum: 0.000000
+2023-10-14 03:13:23,530 epoch 1 - iter 3610/3617 - loss 0.61237021 - time (sec): 990.56 - samples/sec: 382.89 - lr: 0.000150 - momentum: 0.000000
+2023-10-14 03:13:25,225 ----------------------------------------------------------------------------------------------------
+2023-10-14 03:13:25,225 EPOCH 1 done: loss 0.6115 - lr: 0.000150
+2023-10-14 03:14:03,125 DEV : loss 0.12874433398246765 - f1-score (micro avg)  0.6138
+2023-10-14 03:14:03,190 saving best model
+2023-10-14 03:14:04,104 ----------------------------------------------------------------------------------------------------
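The per-iteration lines above follow a fixed layout, which makes them easy to mine for throughput or loss curves after the fact. A small sketch (the field layout is assumed from the lines shown in this log):

```python
import re

# One per-iteration line copied from the epoch 1 block above.
LINE = ("2023-10-14 02:58:32,206 epoch 1 - iter 361/3617 - loss 2.48047885 "
        "- time (sec): 99.23 - samples/sec: 380.07 - lr: 0.000015 - momentum: 0.000000")

# Named groups for each numeric field in the iteration line.
pattern = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - loss (?P<loss>[\d.]+)"
    r" - time \(sec\): (?P<time>[\d.]+) - samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

m = pattern.search(LINE)
print(m.group("epoch"), m.group("loss"), m.group("sps"), m.group("lr"))
```

Applied over the whole file, this recovers roughly ten loss/lr samples per epoch, which is what the TensorBoard event file in `runs/` also records.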
+2023-10-14 03:15:40,905 epoch 2 - iter 361/3617 - loss 0.09418970 - time (sec): 96.80 - samples/sec: 391.51 - lr: 0.000148 - momentum: 0.000000
+2023-10-14 03:17:19,253 epoch 2 - iter 722/3617 - loss 0.09519731 - time (sec): 195.15 - samples/sec: 384.46 - lr: 0.000147 - momentum: 0.000000
+2023-10-14 03:19:03,584 epoch 2 - iter 1083/3617 - loss 0.09418116 - time (sec): 299.48 - samples/sec: 388.15 - lr: 0.000145 - momentum: 0.000000
+2023-10-14 03:20:44,465 epoch 2 - iter 1444/3617 - loss 0.09280081 - time (sec): 400.36 - samples/sec: 387.09 - lr: 0.000143 - momentum: 0.000000
+2023-10-14 03:22:24,435 epoch 2 - iter 1805/3617 - loss 0.09367715 - time (sec): 500.33 - samples/sec: 386.16 - lr: 0.000142 - momentum: 0.000000
+2023-10-14 03:24:00,186 epoch 2 - iter 2166/3617 - loss 0.09233796 - time (sec): 596.08 - samples/sec: 385.19 - lr: 0.000140 - momentum: 0.000000
+2023-10-14 03:25:37,405 epoch 2 - iter 2527/3617 - loss 0.09099030 - time (sec): 693.30 - samples/sec: 385.19 - lr: 0.000138 - momentum: 0.000000
+2023-10-14 03:27:18,849 epoch 2 - iter 2888/3617 - loss 0.09098331 - time (sec): 794.74 - samples/sec: 382.89 - lr: 0.000137 - momentum: 0.000000
+2023-10-14 03:28:58,925 epoch 2 - iter 3249/3617 - loss 0.09046497 - time (sec): 894.82 - samples/sec: 382.77 - lr: 0.000135 - momentum: 0.000000
+2023-10-14 03:30:35,687 epoch 2 - iter 3610/3617 - loss 0.08998216 - time (sec): 991.58 - samples/sec: 382.45 - lr: 0.000133 - momentum: 0.000000
+2023-10-14 03:30:37,445 ----------------------------------------------------------------------------------------------------
+2023-10-14 03:30:37,445 EPOCH 2 done: loss 0.0899 - lr: 0.000133
+2023-10-14 03:31:21,578 DEV : loss 0.11994253098964691 - f1-score (micro avg)  0.6262
+2023-10-14 03:31:21,672 saving best model
+2023-10-14 03:31:24,476 ----------------------------------------------------------------------------------------------------
+2023-10-14 03:33:17,847 epoch 3 - iter 361/3617 - loss 0.06787226 - time (sec): 113.36 - samples/sec: 330.36 - lr: 0.000132 - momentum: 0.000000
+2023-10-14 03:35:08,246 epoch 3 - iter 722/3617 - loss 0.06752508 - time (sec): 223.76 - samples/sec: 344.88 - lr: 0.000130 - momentum: 0.000000
+2023-10-14 03:36:56,698 epoch 3 - iter 1083/3617 - loss 0.06518068 - time (sec): 332.22 - samples/sec: 346.12 - lr: 0.000128 - momentum: 0.000000
+2023-10-14 03:38:44,830 epoch 3 - iter 1444/3617 - loss 0.06371927 - time (sec): 440.35 - samples/sec: 346.81 - lr: 0.000127 - momentum: 0.000000
+2023-10-14 03:40:32,739 epoch 3 - iter 1805/3617 - loss 0.06387658 - time (sec): 548.26 - samples/sec: 350.40 - lr: 0.000125 - momentum: 0.000000
+2023-10-14 03:42:18,558 epoch 3 - iter 2166/3617 - loss 0.06346408 - time (sec): 654.08 - samples/sec: 352.37 - lr: 0.000123 - momentum: 0.000000
+2023-10-14 03:44:05,888 epoch 3 - iter 2527/3617 - loss 0.06459000 - time (sec): 761.41 - samples/sec: 350.21 - lr: 0.000122 - momentum: 0.000000
+2023-10-14 03:45:55,184 epoch 3 - iter 2888/3617 - loss 0.06401459 - time (sec): 870.70 - samples/sec: 348.70 - lr: 0.000120 - momentum: 0.000000
+2023-10-14 03:47:39,172 epoch 3 - iter 3249/3617 - loss 0.06394856 - time (sec): 974.69 - samples/sec: 349.27 - lr: 0.000118 - momentum: 0.000000
+2023-10-14 03:49:25,723 epoch 3 - iter 3610/3617 - loss 0.06409756 - time (sec): 1081.24 - samples/sec: 350.60 - lr: 0.000117 - momentum: 0.000000
+2023-10-14 03:49:27,697 ----------------------------------------------------------------------------------------------------
+2023-10-14 03:49:27,697 EPOCH 3 done: loss 0.0640 - lr: 0.000117
+2023-10-14 03:50:10,253 DEV : loss 0.1677953451871872 - f1-score (micro avg)  0.6461
+2023-10-14 03:50:10,312 saving best model
+2023-10-14 03:50:13,061 ----------------------------------------------------------------------------------------------------
+2023-10-14 03:52:00,514 epoch 4 - iter 361/3617 - loss 0.04549276 - time (sec): 107.45 - samples/sec: 340.97 - lr: 0.000115 - momentum: 0.000000
+2023-10-14 03:53:47,270 epoch 4 - iter 722/3617 - loss 0.04285214 - time (sec): 214.20 - samples/sec: 351.22 - lr: 0.000113 - momentum: 0.000000
+2023-10-14 03:55:39,687 epoch 4 - iter 1083/3617 - loss 0.04301779 - time (sec): 326.62 - samples/sec: 345.40 - lr: 0.000112 - momentum: 0.000000
+2023-10-14 03:57:21,761 epoch 4 - iter 1444/3617 - loss 0.04227254 - time (sec): 428.70 - samples/sec: 349.22 - lr: 0.000110 - momentum: 0.000000
+2023-10-14 03:59:08,557 epoch 4 - iter 1805/3617 - loss 0.04271265 - time (sec): 535.49 - samples/sec: 350.63 - lr: 0.000108 - momentum: 0.000000
+2023-10-14 04:00:56,545 epoch 4 - iter 2166/3617 - loss 0.04401912 - time (sec): 643.48 - samples/sec: 353.09 - lr: 0.000107 - momentum: 0.000000
+2023-10-14 04:02:49,728 epoch 4 - iter 2527/3617 - loss 0.04552488 - time (sec): 756.66 - samples/sec: 351.47 - lr: 0.000105 - momentum: 0.000000
+2023-10-14 04:04:38,740 epoch 4 - iter 2888/3617 - loss 0.04620049 - time (sec): 865.67 - samples/sec: 350.11 - lr: 0.000103 - momentum: 0.000000
+2023-10-14 04:06:26,364 epoch 4 - iter 3249/3617 - loss 0.04661299 - time (sec): 973.30 - samples/sec: 351.46 - lr: 0.000102 - momentum: 0.000000
+2023-10-14 04:08:08,784 epoch 4 - iter 3610/3617 - loss 0.04677504 - time (sec): 1075.72 - samples/sec: 352.63 - lr: 0.000100 - momentum: 0.000000
+2023-10-14 04:08:10,581 ----------------------------------------------------------------------------------------------------
+2023-10-14 04:08:10,581 EPOCH 4 done: loss 0.0467 - lr: 0.000100
+2023-10-14 04:08:52,500 DEV : loss 0.2164839208126068 - f1-score (micro avg)  0.6366
+2023-10-14 04:08:52,567 ----------------------------------------------------------------------------------------------------
+2023-10-14 04:10:41,619 epoch 5 - iter 361/3617 - loss 0.02826111 - time (sec): 109.05 - samples/sec: 352.78 - lr: 0.000098 - momentum: 0.000000
+2023-10-14 04:12:31,812 epoch 5 - iter 722/3617 - loss 0.02975887 - time (sec): 219.24 - samples/sec: 348.59 - lr: 0.000097 - momentum: 0.000000
+2023-10-14 04:14:13,673 epoch 5 - iter 1083/3617 - loss 0.03117931 - time (sec): 321.10 - samples/sec: 353.89 - lr: 0.000095 - momentum: 0.000000
+2023-10-14 04:15:59,664 epoch 5 - iter 1444/3617 - loss 0.03114051 - time (sec): 427.09 - samples/sec: 351.78 - lr: 0.000093 - momentum: 0.000000
+2023-10-14 04:17:51,747 epoch 5 - iter 1805/3617 - loss 0.03157137 - time (sec): 539.18 - samples/sec: 350.38 - lr: 0.000092 - momentum: 0.000000
+2023-10-14 04:19:39,699 epoch 5 - iter 2166/3617 - loss 0.03117937 - time (sec): 647.13 - samples/sec: 349.96 - lr: 0.000090 - momentum: 0.000000
+2023-10-14 04:21:21,870 epoch 5 - iter 2527/3617 - loss 0.03194759 - time (sec): 749.30 - samples/sec: 350.89 - lr: 0.000088 - momentum: 0.000000
+2023-10-14 04:23:09,331 epoch 5 - iter 2888/3617 - loss 0.03227297 - time (sec): 856.76 - samples/sec: 351.11 - lr: 0.000087 - momentum: 0.000000
+2023-10-14 04:24:50,326 epoch 5 - iter 3249/3617 - loss 0.03237174 - time (sec): 957.76 - samples/sec: 355.14 - lr: 0.000085 - momentum: 0.000000
+2023-10-14 04:26:33,987 epoch 5 - iter 3610/3617 - loss 0.03258639 - time (sec): 1061.42 - samples/sec: 357.40 - lr: 0.000083 - momentum: 0.000000
+2023-10-14 04:26:35,865 ----------------------------------------------------------------------------------------------------
+2023-10-14 04:26:35,865 EPOCH 5 done: loss 0.0326 - lr: 0.000083
+2023-10-14 04:27:17,687 DEV : loss 0.23494853079319 - f1-score (micro avg)  0.641
+2023-10-14 04:27:17,752 ----------------------------------------------------------------------------------------------------
+2023-10-14 04:29:14,894 epoch 6 - iter 361/3617 - loss 0.01939447 - time (sec): 117.14 - samples/sec: 335.01 - lr: 0.000082 - momentum: 0.000000
+2023-10-14 04:30:58,369 epoch 6 - iter 722/3617 - loss 0.01924277 - time (sec): 220.61 - samples/sec: 349.27 - lr: 0.000080 - momentum: 0.000000
+2023-10-14 04:32:36,989 epoch 6 - iter 1083/3617 - loss 0.02053705 - time (sec): 319.23 - samples/sec: 358.43 - lr: 0.000078 - momentum: 0.000000
+2023-10-14 04:34:18,846 epoch 6 - iter 1444/3617 - loss 0.02164758 - time (sec): 421.09 - samples/sec: 358.61 - lr: 0.000077 - momentum: 0.000000
+2023-10-14 04:36:07,097 epoch 6 - iter 1805/3617 - loss 0.02255132 - time (sec): 529.34 - samples/sec: 354.80 - lr: 0.000075 - momentum: 0.000000
+2023-10-14 04:37:50,944 epoch 6 - iter 2166/3617 - loss 0.02251730 - time (sec): 633.19 - samples/sec: 355.87 - lr: 0.000073 - momentum: 0.000000
+2023-10-14 04:39:38,861 epoch 6 - iter 2527/3617 - loss 0.02245883 - time (sec): 741.11 - samples/sec: 357.56 - lr: 0.000072 - momentum: 0.000000
+2023-10-14 04:41:21,087 epoch 6 - iter 2888/3617 - loss 0.02197406 - time (sec): 843.33 - samples/sec: 360.49 - lr: 0.000070 - momentum: 0.000000
+2023-10-14 04:43:02,150 epoch 6 - iter 3249/3617 - loss 0.02290527 - time (sec): 944.40 - samples/sec: 360.40 - lr: 0.000068 - momentum: 0.000000
+2023-10-14 04:44:46,979 epoch 6 - iter 3610/3617 - loss 0.02272655 - time (sec): 1049.22 - samples/sec: 361.29 - lr: 0.000067 - momentum: 0.000000
+2023-10-14 04:44:48,993 ----------------------------------------------------------------------------------------------------
+2023-10-14 04:44:48,994 EPOCH 6 done: loss 0.0227 - lr: 0.000067
+2023-10-14 04:45:30,501 DEV : loss 0.28848496079444885 - f1-score (micro avg)  0.6514
+2023-10-14 04:45:30,570 saving best model
+2023-10-14 04:45:35,631 ----------------------------------------------------------------------------------------------------
+2023-10-14 04:47:22,333 epoch 7 - iter 361/3617 - loss 0.01165935 - time (sec): 106.69 - samples/sec: 359.92 - lr: 0.000065 - momentum: 0.000000
+2023-10-14 04:49:03,875 epoch 7 - iter 722/3617 - loss 0.01137913 - time (sec): 208.23 - samples/sec: 365.84 - lr: 0.000063 - momentum: 0.000000
+2023-10-14 04:50:48,226 epoch 7 - iter 1083/3617 - loss 0.01305213 - time (sec): 312.58 - samples/sec: 363.16 - lr: 0.000062 - momentum: 0.000000
+2023-10-14 04:52:34,504 epoch 7 - iter 1444/3617 - loss 0.01287060 - time (sec): 418.86 - samples/sec: 365.53 - lr: 0.000060 - momentum: 0.000000
+2023-10-14 04:54:21,015 epoch 7 - iter 1805/3617 - loss 0.01312161 - time (sec): 525.37 - samples/sec: 362.68 - lr: 0.000058 - momentum: 0.000000
+2023-10-14 04:56:05,373 epoch 7 - iter 2166/3617 - loss 0.01362263 - time (sec): 629.73 - samples/sec: 362.43 - lr: 0.000057 - momentum: 0.000000
+2023-10-14 04:57:53,808 epoch 7 - iter 2527/3617 - loss 0.01454342 - time (sec): 738.16 - samples/sec: 361.92 - lr: 0.000055 - momentum: 0.000000
+2023-10-14 04:59:36,320 epoch 7 - iter 2888/3617 - loss 0.01455589 - time (sec): 840.68 - samples/sec: 362.15 - lr: 0.000053 - momentum: 0.000000
+2023-10-14 05:01:16,947 epoch 7 - iter 3249/3617 - loss 0.01486078 - time (sec): 941.30 - samples/sec: 363.83 - lr: 0.000052 - momentum: 0.000000
+2023-10-14 05:03:01,921 epoch 7 - iter 3610/3617 - loss 0.01483858 - time (sec): 1046.28 - samples/sec: 362.57 - lr: 0.000050 - momentum: 0.000000
+2023-10-14 05:03:03,721 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:03:03,721 EPOCH 7 done: loss 0.0148 - lr: 0.000050
+2023-10-14 05:03:46,800 DEV : loss 0.3004520535469055 - f1-score (micro avg)  0.6474
+2023-10-14 05:03:46,866 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:05:31,450 epoch 8 - iter 361/3617 - loss 0.00618969 - time (sec): 104.58 - samples/sec: 355.17 - lr: 0.000048 - momentum: 0.000000
+2023-10-14 05:07:14,817 epoch 8 - iter 722/3617 - loss 0.00964368 - time (sec): 207.95 - samples/sec: 361.93 - lr: 0.000047 - momentum: 0.000000
+2023-10-14 05:09:01,458 epoch 8 - iter 1083/3617 - loss 0.01105306 - time (sec): 314.59 - samples/sec: 364.87 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 05:10:45,447 epoch 8 - iter 1444/3617 - loss 0.01059372 - time (sec): 418.58 - samples/sec: 364.30 - lr: 0.000043 - momentum: 0.000000
+2023-10-14 05:12:30,524 epoch 8 - iter 1805/3617 - loss 0.01012078 - time (sec): 523.66 - samples/sec: 365.47 - lr: 0.000042 - momentum: 0.000000
+2023-10-14 05:14:14,804 epoch 8 - iter 2166/3617 - loss 0.00963085 - time (sec): 627.94 - samples/sec: 364.14 - lr: 0.000040 - momentum: 0.000000
+2023-10-14 05:16:02,304 epoch 8 - iter 2527/3617 - loss 0.00979590 - time (sec): 735.44 - samples/sec: 362.59 - lr: 0.000038 - momentum: 0.000000
+2023-10-14 05:17:45,665 epoch 8 - iter 2888/3617 - loss 0.00961432 - time (sec): 838.80 - samples/sec: 362.84 - lr: 0.000037 - momentum: 0.000000
+2023-10-14 05:19:28,394 epoch 8 - iter 3249/3617 - loss 0.00997350 - time (sec): 941.53 - samples/sec: 362.99 - lr: 0.000035 - momentum: 0.000000
+2023-10-14 05:21:12,120 epoch 8 - iter 3610/3617 - loss 0.00969534 - time (sec): 1045.25 - samples/sec: 363.06 - lr: 0.000033 - momentum: 0.000000
+2023-10-14 05:21:13,905 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:21:13,905 EPOCH 8 done: loss 0.0097 - lr: 0.000033
+2023-10-14 05:21:55,993 DEV : loss 0.33705711364746094 - f1-score (micro avg)  0.6492
+2023-10-14 05:21:56,061 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:23:41,737 epoch 9 - iter 361/3617 - loss 0.00768336 - time (sec): 105.67 - samples/sec: 366.27 - lr: 0.000032 - momentum: 0.000000
+2023-10-14 05:25:32,318 epoch 9 - iter 722/3617 - loss 0.00809368 - time (sec): 216.25 - samples/sec: 361.32 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 05:27:20,923 epoch 9 - iter 1083/3617 - loss 0.00801755 - time (sec): 324.86 - samples/sec: 355.90 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 05:29:02,262 epoch 9 - iter 1444/3617 - loss 0.00794549 - time (sec): 426.20 - samples/sec: 361.14 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 05:30:41,183 epoch 9 - iter 1805/3617 - loss 0.00771750 - time (sec): 525.12 - samples/sec: 364.12 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 05:32:20,956 epoch 9 - iter 2166/3617 - loss 0.00754007 - time (sec): 624.89 - samples/sec: 367.42 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 05:34:00,453 epoch 9 - iter 2527/3617 - loss 0.00713992 - time (sec): 724.39 - samples/sec: 369.15 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 05:35:41,205 epoch 9 - iter 2888/3617 - loss 0.00704793 - time (sec): 825.14 - samples/sec: 368.97 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 05:37:23,105 epoch 9 - iter 3249/3617 - loss 0.00699019 - time (sec): 927.04 - samples/sec: 367.49 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 05:39:03,150 epoch 9 - iter 3610/3617 - loss 0.00658969 - time (sec): 1027.09 - samples/sec: 369.22 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 05:39:05,141 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:39:05,142 EPOCH 9 done: loss 0.0066 - lr: 0.000017
+2023-10-14 05:39:46,293 DEV : loss 0.3554496467113495 - f1-score (micro avg)  0.65
+2023-10-14 05:39:46,351 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:41:28,259 epoch 10 - iter 361/3617 - loss 0.00317370 - time (sec): 101.91 - samples/sec: 370.33 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 05:43:12,503 epoch 10 - iter 722/3617 - loss 0.00395930 - time (sec): 206.15 - samples/sec: 358.82 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 05:44:56,641 epoch 10 - iter 1083/3617 - loss 0.00484872 - time (sec): 310.29 - samples/sec: 362.49 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 05:46:40,943 epoch 10 - iter 1444/3617 - loss 0.00441603 - time (sec): 414.59 - samples/sec: 360.25 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 05:48:22,688 epoch 10 - iter 1805/3617 - loss 0.00440111 - time (sec): 516.33 - samples/sec: 364.19 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 05:50:03,498 epoch 10 - iter 2166/3617 - loss 0.00463574 - time (sec): 617.14 - samples/sec: 365.23 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 05:51:45,733 epoch 10 - iter 2527/3617 - loss 0.00475224 - time (sec): 719.38 - samples/sec: 368.15 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 05:53:30,970 epoch 10 - iter 2888/3617 - loss 0.00477992 - time (sec): 824.62 - samples/sec: 366.33 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 05:55:21,600 epoch 10 - iter 3249/3617 - loss 0.00469963 - time (sec): 935.25 - samples/sec: 364.85 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 05:57:08,911 epoch 10 - iter 3610/3617 - loss 0.00449982 - time (sec): 1042.56 - samples/sec: 363.50 - lr: 0.000000 - momentum: 0.000000
+2023-10-14 05:57:11,042 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:57:11,043 EPOCH 10 done: loss 0.0045 - lr: 0.000000
+2023-10-14 05:57:55,357 DEV : loss 0.3686419725418091 - f1-score (micro avg)  0.654
+2023-10-14 05:57:55,427 saving best model
+2023-10-14 05:58:03,839 ----------------------------------------------------------------------------------------------------
+2023-10-14 05:58:03,841 Loading model from best epoch ...
+2023-10-14 05:58:08,061 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+2023-10-14 05:59:10,367
+Results:
+- F-score (micro) 0.6565
+- F-score (macro) 0.5195
+- Accuracy 0.5017
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.6609    0.7750    0.7134       591
+        pers     0.5807    0.7255    0.6451       357
+         org     0.2295    0.1772    0.2000        79
+
+   micro avg     0.6092    0.7118    0.6565      1027
+   macro avg     0.4904    0.5592    0.5195      1027
+weighted avg     0.5998    0.7118    0.6502      1027
+
+2023-10-14 05:59:10,367 ----------------------------------------------------------------------------------------------------
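The "micro avg" F1 in the final report is simply the harmonic mean of the micro-averaged precision and recall printed in the same row, which can be checked directly from the numbers above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Micro-averaged precision and recall from the test-set report above.
print(round(f1(0.6092, 0.7118), 4))  # matches the reported micro F1 of 0.6565
```

The same relation holds per class (e.g. loc: 0.6609 precision, 0.7750 recall give 0.7134), while the macro F1 of 0.5195 is the unweighted mean over the three class F1 scores, dragged down by the weak org class.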