Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697677859.46dc0c540dd0.3802.12 +3 -0
- test.tsv +0 -0
- training.log +245 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8f55a702588902480e78d04ab437dba6e98ec2b2f704035a08562a524ca8c2d6
+size 19045922
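The checkpoint above is stored as a Git LFS pointer rather than the raw weights; the three `key value` lines identify the real file by hash and size. A minimal sketch of reading such a pointer (the pointer text is copied from the diff above):

```python
# Parse a Git LFS pointer file into its key/value fields.
# The pointer text below is copied verbatim from the best-model.pt diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:8f55a702588902480e78d04ab437dba6e98ec2b2f704035a08562a524ca8c2d6
size 19045922
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

fields = parse_lfs_pointer(pointer_text)
print(fields["oid"])        # sha256:8f55a70258...
print(int(fields["size"]))  # 19045922 (about 19 MB of model weights)
```

The `size` field is the byte count of the actual checkpoint, which is why the rendered diff is only three lines long.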
dev.tsv
ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 01:11:58 0.0000 0.7293 0.1804 0.2137 0.1751 0.1925 0.1094
+2 01:12:59 0.0000 0.1857 0.1723 0.3455 0.4336 0.3846 0.2474
+3 01:14:00 0.0000 0.1578 0.1696 0.3595 0.4931 0.4158 0.2723
+4 01:15:01 0.0000 0.1464 0.1666 0.3991 0.5114 0.4483 0.2986
+5 01:16:02 0.0000 0.1351 0.1814 0.3782 0.5824 0.4586 0.3092
+6 01:17:02 0.0000 0.1264 0.1841 0.3907 0.6030 0.4741 0.3210
+7 01:18:03 0.0000 0.1222 0.1875 0.4058 0.5812 0.4779 0.3227
+8 01:19:04 0.0000 0.1166 0.1948 0.4165 0.5824 0.4857 0.3299
+9 01:20:05 0.0000 0.1135 0.1936 0.4142 0.5801 0.4833 0.3275
+10 01:21:05 0.0000 0.1097 0.1968 0.4118 0.5847 0.4832 0.3278
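The epoch that best-model.pt was saved from can be read straight off this table: it is the row with the highest DEV_F1. A minimal sketch, embedding the rows above as a literal:

```python
# loss.tsv rows copied from the table above (whitespace-separated).
loss_table = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 01:11:58 0.0000 0.7293 0.1804 0.2137 0.1751 0.1925 0.1094
2 01:12:59 0.0000 0.1857 0.1723 0.3455 0.4336 0.3846 0.2474
3 01:14:00 0.0000 0.1578 0.1696 0.3595 0.4931 0.4158 0.2723
4 01:15:01 0.0000 0.1464 0.1666 0.3991 0.5114 0.4483 0.2986
5 01:16:02 0.0000 0.1351 0.1814 0.3782 0.5824 0.4586 0.3092
6 01:17:02 0.0000 0.1264 0.1841 0.3907 0.6030 0.4741 0.3210
7 01:18:03 0.0000 0.1222 0.1875 0.4058 0.5812 0.4779 0.3227
8 01:19:04 0.0000 0.1166 0.1948 0.4165 0.5824 0.4857 0.3299
9 01:20:05 0.0000 0.1135 0.1936 0.4142 0.5801 0.4833 0.3275
10 01:21:05 0.0000 0.1097 0.1968 0.4118 0.5847 0.4832 0.3278
"""

rows = [line.split() for line in loss_table.strip().splitlines()]
header, data = rows[0], rows[1:]
f1_idx = header.index("DEV_F1")

# The epoch with the highest dev F1 is the one the best checkpoint comes from.
best = max(data, key=lambda row: float(row[f1_idx]))
print(best[0], best[f1_idx])  # 8 0.4857
```

This agrees with training.log, where "saving best model" appears for the last time after epoch 8.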
runs/events.out.tfevents.1697677859.46dc0c540dd0.3802.12
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c38de5dfc6ca4c9f3b0675d7b92556d7e81c895c9077cb17acc37633b2c775ab
+size 2030580
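Among the scalars recorded in this TensorBoard event file is the learning-rate curve. Per training.log below, the run uses a LinearScheduler with warmup_fraction 0.1, so the lr ramps up over the first 10% of steps and then decays linearly to zero. A rough sketch of that shape (not Flair's exact implementation; `total_steps` is 3617 iterations x 10 epochs, taken from the log):

```python
def linear_warmup_lr(step: int, peak_lr: float = 3e-5,
                     total_steps: int = 36170,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0.
    A sketch of the schedule shape seen in the log, not Flair's code."""
    warmup_steps = int(total_steps * warmup_fraction)  # 3617 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_warmup_lr(361))    # ~3e-06, matching lr at iter 361 of epoch 1
print(linear_warmup_lr(3617))   # peak 3e-05 at the end of epoch 1
print(linear_warmup_lr(36170))  # 0.0 at the end of training
```

The values match the `lr:` column in the log: it climbs to 0.000030 by the end of epoch 1 and falls to 0.000000 by epoch 10.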
test.tsv
ADDED
The diff for this file is too large to render. See raw diff.
training.log
ADDED
@@ -0,0 +1,245 @@
+2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,040 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 128)
+        (position_embeddings): Embedding(512, 128)
+        (token_type_embeddings): Embedding(2, 128)
+        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-1): 2 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=128, out_features=128, bias=True)
+                (key): Linear(in_features=128, out_features=128, bias=True)
+                (value): Linear(in_features=128, out_features=128, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=128, out_features=128, bias=True)
+                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=128, out_features=512, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=512, out_features=128, bias=True)
+              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=128, out_features=128, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=128, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,040 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,040 Train: 14465 sentences
+2023-10-19 01:10:59,040 (train_with_dev=False, train_with_test=False)
+2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,040 Training Params:
+2023-10-19 01:10:59,040 - learning_rate: "3e-05"
+2023-10-19 01:10:59,040 - mini_batch_size: "4"
+2023-10-19 01:10:59,040 - max_epochs: "10"
+2023-10-19 01:10:59,040 - shuffle: "True"
+2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,040 Plugins:
+2023-10-19 01:10:59,040 - TensorboardLogger
+2023-10-19 01:10:59,040 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,040 Final evaluation on model from best epoch (best-model.pt)
+2023-10-19 01:10:59,040 - metric: "('micro avg', 'f1-score')"
+2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,041 Computation:
+2023-10-19 01:10:59,041 - compute on device: cuda:0
+2023-10-19 01:10:59,041 - embedding storage: none
+2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,041 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
+2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:10:59,041 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-19 01:11:04,907 epoch 1 - iter 361/3617 - loss 3.12252186 - time (sec): 5.87 - samples/sec: 6145.92 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 01:11:10,541 epoch 1 - iter 722/3617 - loss 2.41293036 - time (sec): 11.50 - samples/sec: 6537.22 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 01:11:16,281 epoch 1 - iter 1083/3617 - loss 1.79547173 - time (sec): 17.24 - samples/sec: 6533.93 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 01:11:21,954 epoch 1 - iter 1444/3617 - loss 1.44599180 - time (sec): 22.91 - samples/sec: 6519.84 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 01:11:27,623 epoch 1 - iter 1805/3617 - loss 1.22012401 - time (sec): 28.58 - samples/sec: 6522.30 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 01:11:33,316 epoch 1 - iter 2166/3617 - loss 1.06655197 - time (sec): 34.27 - samples/sec: 6522.49 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 01:11:39,043 epoch 1 - iter 2527/3617 - loss 0.95272625 - time (sec): 40.00 - samples/sec: 6529.96 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 01:11:44,824 epoch 1 - iter 2888/3617 - loss 0.85861658 - time (sec): 45.78 - samples/sec: 6579.78 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 01:11:50,477 epoch 1 - iter 3249/3617 - loss 0.78754238 - time (sec): 51.44 - samples/sec: 6619.51 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 01:11:56,201 epoch 1 - iter 3610/3617 - loss 0.73000898 - time (sec): 57.16 - samples/sec: 6635.44 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 01:11:56,311 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:11:56,311 EPOCH 1 done: loss 0.7293 - lr: 0.000030
+2023-10-19 01:11:58,584 DEV : loss 0.1803603321313858 - f1-score (micro avg) 0.1925
+2023-10-19 01:11:58,611 saving best model
+2023-10-19 01:11:58,640 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:12:04,265 epoch 2 - iter 361/3617 - loss 0.20098825 - time (sec): 5.62 - samples/sec: 6528.17 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 01:12:09,986 epoch 2 - iter 722/3617 - loss 0.20199769 - time (sec): 11.35 - samples/sec: 6570.12 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 01:12:15,686 epoch 2 - iter 1083/3617 - loss 0.19444887 - time (sec): 17.05 - samples/sec: 6553.53 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 01:12:21,265 epoch 2 - iter 1444/3617 - loss 0.19472453 - time (sec): 22.62 - samples/sec: 6574.65 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 01:12:26,799 epoch 2 - iter 1805/3617 - loss 0.19141549 - time (sec): 28.16 - samples/sec: 6668.61 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 01:12:32,506 epoch 2 - iter 2166/3617 - loss 0.18942948 - time (sec): 33.87 - samples/sec: 6651.96 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 01:12:38,393 epoch 2 - iter 2527/3617 - loss 0.19177342 - time (sec): 39.75 - samples/sec: 6633.01 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 01:12:44,107 epoch 2 - iter 2888/3617 - loss 0.18890055 - time (sec): 45.47 - samples/sec: 6637.92 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 01:12:49,797 epoch 2 - iter 3249/3617 - loss 0.18726829 - time (sec): 51.16 - samples/sec: 6664.34 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 01:12:55,492 epoch 2 - iter 3610/3617 - loss 0.18564262 - time (sec): 56.85 - samples/sec: 6673.24 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 01:12:55,588 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:12:55,588 EPOCH 2 done: loss 0.1857 - lr: 0.000027
+2023-10-19 01:12:59,509 DEV : loss 0.17232035100460052 - f1-score (micro avg) 0.3846
+2023-10-19 01:12:59,536 saving best model
+2023-10-19 01:12:59,569 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:13:05,350 epoch 3 - iter 361/3617 - loss 0.15068487 - time (sec): 5.78 - samples/sec: 6657.47 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 01:13:11,111 epoch 3 - iter 722/3617 - loss 0.16512105 - time (sec): 11.54 - samples/sec: 6531.24 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 01:13:16,903 epoch 3 - iter 1083/3617 - loss 0.16734512 - time (sec): 17.33 - samples/sec: 6649.78 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 01:13:22,599 epoch 3 - iter 1444/3617 - loss 0.16745629 - time (sec): 23.03 - samples/sec: 6608.73 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 01:13:28,382 epoch 3 - iter 1805/3617 - loss 0.16513383 - time (sec): 28.81 - samples/sec: 6577.01 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 01:13:34,182 epoch 3 - iter 2166/3617 - loss 0.16581818 - time (sec): 34.61 - samples/sec: 6543.84 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 01:13:40,005 epoch 3 - iter 2527/3617 - loss 0.16204270 - time (sec): 40.44 - samples/sec: 6531.56 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 01:13:45,508 epoch 3 - iter 2888/3617 - loss 0.15884163 - time (sec): 45.94 - samples/sec: 6591.42 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 01:13:51,225 epoch 3 - iter 3249/3617 - loss 0.15690141 - time (sec): 51.66 - samples/sec: 6628.98 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 01:13:56,955 epoch 3 - iter 3610/3617 - loss 0.15791128 - time (sec): 57.38 - samples/sec: 6609.03 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 01:13:57,053 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:13:57,053 EPOCH 3 done: loss 0.1578 - lr: 0.000023
+2023-10-19 01:14:00,254 DEV : loss 0.16959495842456818 - f1-score (micro avg) 0.4158
+2023-10-19 01:14:00,282 saving best model
+2023-10-19 01:14:00,315 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:14:06,037 epoch 4 - iter 361/3617 - loss 0.13799445 - time (sec): 5.72 - samples/sec: 6598.38 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 01:14:11,790 epoch 4 - iter 722/3617 - loss 0.13722792 - time (sec): 11.47 - samples/sec: 6581.33 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 01:14:17,606 epoch 4 - iter 1083/3617 - loss 0.14251973 - time (sec): 17.29 - samples/sec: 6618.47 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 01:14:23,333 epoch 4 - iter 1444/3617 - loss 0.14215649 - time (sec): 23.02 - samples/sec: 6612.71 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 01:14:29,047 epoch 4 - iter 1805/3617 - loss 0.14527769 - time (sec): 28.73 - samples/sec: 6567.33 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 01:14:34,901 epoch 4 - iter 2166/3617 - loss 0.14771726 - time (sec): 34.58 - samples/sec: 6575.48 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 01:14:40,765 epoch 4 - iter 2527/3617 - loss 0.14620802 - time (sec): 40.45 - samples/sec: 6581.63 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 01:14:46,464 epoch 4 - iter 2888/3617 - loss 0.14434453 - time (sec): 46.15 - samples/sec: 6577.15 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 01:14:52,181 epoch 4 - iter 3249/3617 - loss 0.14432907 - time (sec): 51.86 - samples/sec: 6602.27 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 01:14:57,873 epoch 4 - iter 3610/3617 - loss 0.14625128 - time (sec): 57.56 - samples/sec: 6587.12 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 01:14:57,984 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:14:57,984 EPOCH 4 done: loss 0.1464 - lr: 0.000020
+2023-10-19 01:15:01,868 DEV : loss 0.16661077737808228 - f1-score (micro avg) 0.4483
+2023-10-19 01:15:01,897 saving best model
+2023-10-19 01:15:01,929 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:15:07,423 epoch 5 - iter 361/3617 - loss 0.12987237 - time (sec): 5.49 - samples/sec: 6882.43 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 01:15:13,383 epoch 5 - iter 722/3617 - loss 0.12949253 - time (sec): 11.45 - samples/sec: 6532.00 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 01:15:19,029 epoch 5 - iter 1083/3617 - loss 0.13420185 - time (sec): 17.10 - samples/sec: 6506.71 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 01:15:24,691 epoch 5 - iter 1444/3617 - loss 0.13910024 - time (sec): 22.76 - samples/sec: 6542.09 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 01:15:30,487 epoch 5 - iter 1805/3617 - loss 0.13856134 - time (sec): 28.56 - samples/sec: 6558.16 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 01:15:36,168 epoch 5 - iter 2166/3617 - loss 0.13530390 - time (sec): 34.24 - samples/sec: 6550.97 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 01:15:41,913 epoch 5 - iter 2527/3617 - loss 0.13679379 - time (sec): 39.98 - samples/sec: 6590.17 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 01:15:47,672 epoch 5 - iter 2888/3617 - loss 0.13624011 - time (sec): 45.74 - samples/sec: 6604.87 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 01:15:53,434 epoch 5 - iter 3249/3617 - loss 0.13560627 - time (sec): 51.50 - samples/sec: 6616.88 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 01:15:59,190 epoch 5 - iter 3610/3617 - loss 0.13495541 - time (sec): 57.26 - samples/sec: 6624.56 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 01:15:59,302 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:15:59,303 EPOCH 5 done: loss 0.1351 - lr: 0.000017
+2023-10-19 01:16:02,561 DEV : loss 0.1814231425523758 - f1-score (micro avg) 0.4586
+2023-10-19 01:16:02,588 saving best model
+2023-10-19 01:16:02,621 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:16:08,251 epoch 6 - iter 361/3617 - loss 0.13714181 - time (sec): 5.63 - samples/sec: 6642.75 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 01:16:13,926 epoch 6 - iter 722/3617 - loss 0.12680325 - time (sec): 11.30 - samples/sec: 6662.60 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 01:16:19,692 epoch 6 - iter 1083/3617 - loss 0.12500890 - time (sec): 17.07 - samples/sec: 6691.92 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 01:16:25,275 epoch 6 - iter 1444/3617 - loss 0.12471214 - time (sec): 22.65 - samples/sec: 6643.42 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 01:16:30,912 epoch 6 - iter 1805/3617 - loss 0.12052615 - time (sec): 28.29 - samples/sec: 6579.21 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 01:16:36,593 epoch 6 - iter 2166/3617 - loss 0.12279489 - time (sec): 33.97 - samples/sec: 6628.67 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 01:16:42,367 epoch 6 - iter 2527/3617 - loss 0.12224267 - time (sec): 39.74 - samples/sec: 6630.65 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 01:16:48,183 epoch 6 - iter 2888/3617 - loss 0.12426780 - time (sec): 45.56 - samples/sec: 6629.71 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 01:16:53,829 epoch 6 - iter 3249/3617 - loss 0.12488997 - time (sec): 51.21 - samples/sec: 6654.26 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 01:16:59,084 epoch 6 - iter 3610/3617 - loss 0.12639819 - time (sec): 56.46 - samples/sec: 6717.91 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 01:16:59,180 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:16:59,181 EPOCH 6 done: loss 0.1264 - lr: 0.000013
+2023-10-19 01:17:02,429 DEV : loss 0.1841202825307846 - f1-score (micro avg) 0.4741
+2023-10-19 01:17:02,456 saving best model
+2023-10-19 01:17:02,489 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:17:08,157 epoch 7 - iter 361/3617 - loss 0.12363306 - time (sec): 5.67 - samples/sec: 6708.62 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 01:17:13,867 epoch 7 - iter 722/3617 - loss 0.12287275 - time (sec): 11.38 - samples/sec: 6576.34 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 01:17:19,709 epoch 7 - iter 1083/3617 - loss 0.11963241 - time (sec): 17.22 - samples/sec: 6579.90 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 01:17:25,354 epoch 7 - iter 1444/3617 - loss 0.12218097 - time (sec): 22.86 - samples/sec: 6637.46 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 01:17:31,082 epoch 7 - iter 1805/3617 - loss 0.12399375 - time (sec): 28.59 - samples/sec: 6639.53 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 01:17:36,684 epoch 7 - iter 2166/3617 - loss 0.12278498 - time (sec): 34.19 - samples/sec: 6696.17 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 01:17:43,138 epoch 7 - iter 2527/3617 - loss 0.12015701 - time (sec): 40.65 - samples/sec: 6563.75 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 01:17:48,760 epoch 7 - iter 2888/3617 - loss 0.11877865 - time (sec): 46.27 - samples/sec: 6593.24 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 01:17:54,087 epoch 7 - iter 3249/3617 - loss 0.12139837 - time (sec): 51.60 - samples/sec: 6624.35 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 01:17:59,808 epoch 7 - iter 3610/3617 - loss 0.12214986 - time (sec): 57.32 - samples/sec: 6617.30 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 01:17:59,917 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:17:59,917 EPOCH 7 done: loss 0.1222 - lr: 0.000010
+2023-10-19 01:18:03,123 DEV : loss 0.18754823505878448 - f1-score (micro avg) 0.4779
+2023-10-19 01:18:03,151 saving best model
+2023-10-19 01:18:03,186 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:18:09,163 epoch 8 - iter 361/3617 - loss 0.11552861 - time (sec): 5.98 - samples/sec: 6550.53 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 01:18:14,914 epoch 8 - iter 722/3617 - loss 0.11948353 - time (sec): 11.73 - samples/sec: 6616.21 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 01:18:20,735 epoch 8 - iter 1083/3617 - loss 0.11968151 - time (sec): 17.55 - samples/sec: 6671.45 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 01:18:26,269 epoch 8 - iter 1444/3617 - loss 0.12114434 - time (sec): 23.08 - samples/sec: 6666.92 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 01:18:32,065 epoch 8 - iter 1805/3617 - loss 0.11664197 - time (sec): 28.88 - samples/sec: 6667.87 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 01:18:37,854 epoch 8 - iter 2166/3617 - loss 0.11725016 - time (sec): 34.67 - samples/sec: 6598.38 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 01:18:43,593 epoch 8 - iter 2527/3617 - loss 0.11944440 - time (sec): 40.41 - samples/sec: 6575.91 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 01:18:49,269 epoch 8 - iter 2888/3617 - loss 0.11884190 - time (sec): 46.08 - samples/sec: 6587.45 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 01:18:54,994 epoch 8 - iter 3249/3617 - loss 0.11785410 - time (sec): 51.81 - samples/sec: 6583.13 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 01:19:00,793 epoch 8 - iter 3610/3617 - loss 0.11660685 - time (sec): 57.61 - samples/sec: 6587.20 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 01:19:00,903 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:19:00,903 EPOCH 8 done: loss 0.1166 - lr: 0.000007
+2023-10-19 01:19:04,160 DEV : loss 0.19483603537082672 - f1-score (micro avg) 0.4857
+2023-10-19 01:19:04,188 saving best model
+2023-10-19 01:19:04,220 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:19:10,105 epoch 9 - iter 361/3617 - loss 0.12224075 - time (sec): 5.88 - samples/sec: 6742.34 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 01:19:15,778 epoch 9 - iter 722/3617 - loss 0.10932434 - time (sec): 11.56 - samples/sec: 6748.35 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 01:19:21,542 epoch 9 - iter 1083/3617 - loss 0.11140440 - time (sec): 17.32 - samples/sec: 6728.86 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 01:19:27,301 epoch 9 - iter 1444/3617 - loss 0.10874201 - time (sec): 23.08 - samples/sec: 6632.17 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 01:19:32,820 epoch 9 - iter 1805/3617 - loss 0.10872015 - time (sec): 28.60 - samples/sec: 6732.84 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 01:19:38,447 epoch 9 - iter 2166/3617 - loss 0.11133825 - time (sec): 34.23 - samples/sec: 6714.00 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 01:19:44,220 epoch 9 - iter 2527/3617 - loss 0.11266491 - time (sec): 40.00 - samples/sec: 6670.77 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 01:19:50,013 epoch 9 - iter 2888/3617 - loss 0.11290616 - time (sec): 45.79 - samples/sec: 6672.15 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 01:19:55,727 epoch 9 - iter 3249/3617 - loss 0.11269907 - time (sec): 51.51 - samples/sec: 6647.26 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 01:20:01,410 epoch 9 - iter 3610/3617 - loss 0.11350583 - time (sec): 57.19 - samples/sec: 6630.19 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 01:20:01,524 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:20:01,524 EPOCH 9 done: loss 0.1135 - lr: 0.000003
+2023-10-19 01:20:05,482 DEV : loss 0.19358040392398834 - f1-score (micro avg) 0.4833
+2023-10-19 01:20:05,511 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:20:11,321 epoch 10 - iter 361/3617 - loss 0.11277088 - time (sec): 5.81 - samples/sec: 6246.22 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 01:20:17,167 epoch 10 - iter 722/3617 - loss 0.11290023 - time (sec): 11.66 - samples/sec: 6445.07 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 01:20:22,907 epoch 10 - iter 1083/3617 - loss 0.11198872 - time (sec): 17.40 - samples/sec: 6454.90 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 01:20:28,643 epoch 10 - iter 1444/3617 - loss 0.10861250 - time (sec): 23.13 - samples/sec: 6495.86 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 01:20:33,825 epoch 10 - iter 1805/3617 - loss 0.11091058 - time (sec): 28.31 - samples/sec: 6672.48 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 01:20:39,517 epoch 10 - iter 2166/3617 - loss 0.11258390 - time (sec): 34.01 - samples/sec: 6697.61 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 01:20:45,292 epoch 10 - iter 2527/3617 - loss 0.11155306 - time (sec): 39.78 - samples/sec: 6694.42 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 01:20:51,014 epoch 10 - iter 2888/3617 - loss 0.10931783 - time (sec): 45.50 - samples/sec: 6687.76 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 01:20:56,674 epoch 10 - iter 3249/3617 - loss 0.11004335 - time (sec): 51.16 - samples/sec: 6662.31 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 01:21:02,509 epoch 10 - iter 3610/3617 - loss 0.10947971 - time (sec): 57.00 - samples/sec: 6656.02 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 01:21:02,609 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:21:02,609 EPOCH 10 done: loss 0.1097 - lr: 0.000000
+2023-10-19 01:21:05,822 DEV : loss 0.1967521756887436 - f1-score (micro avg) 0.4832
+2023-10-19 01:21:05,881 ----------------------------------------------------------------------------------------------------
+2023-10-19 01:21:05,881 Loading model from best epoch ...
+2023-10-19 01:21:05,977 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+2023-10-19 01:21:10,135
+Results:
+- F-score (micro) 0.5273
+- F-score (macro) 0.3519
+- Accuracy 0.37
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.5467    0.6734    0.6035       591
+        pers     0.4155    0.4958    0.4521       357
+         org     0.0000    0.0000    0.0000        79
+
+   micro avg     0.4983    0.5599    0.5273      1027
+   macro avg     0.3207    0.3897    0.3519      1027
+weighted avg     0.4590    0.5599    0.5044      1027
+
+2023-10-19 01:21:10,135 ----------------------------------------------------------------------------------------------------
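The aggregate scores in the final evaluation follow directly from the per-class rows: macro F1 is the unweighted mean of the class F1s, weighted F1 weights them by support, and micro F1 is the harmonic mean of the micro-averaged precision and recall. A quick sanity check, with all numbers copied from the table above:

```python
# (precision, recall, f1, support) per class, from the final test evaluation.
per_class = {
    "loc":  (0.5467, 0.6734, 0.6035, 591),
    "pers": (0.4155, 0.4958, 0.4521, 357),
    "org":  (0.0000, 0.0000, 0.0000, 79),
}

total_support = sum(v[3] for v in per_class.values())  # 1027 test entities
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)
weighted_f1 = sum(v[2] * v[3] for v in per_class.values()) / total_support

# Micro F1 from the micro-averaged precision/recall row.
p, r = 0.4983, 0.5599
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4))     # 0.3519
print(round(weighted_f1, 4))  # 0.5044
print(round(micro_f1, 4))     # 0.5273
```

Note how the `org` class (support 79) scoring zero drags macro F1 well below the micro score, which is dominated by the larger `loc` and `pers` classes.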