stefan-it committed
Commit 812cdb2
1 Parent(s): 0074922

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +240 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ddaf3d3fbddd91429fcbfaed2557e48a0dd43f251231d518b2e577618e09d06d
+ size 443311111
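The three lines above are not the model weights themselves but a Git LFS pointer: the ~443 MB checkpoint lives in LFS storage, addressed by its SHA-256 OID. A minimal stdlib sketch of reading such a pointer (`parse_lfs_pointer` is an illustrative helper, not a library API):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", e.g. "size 443311111".
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer content copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ddaf3d3fbddd91429fcbfaed2557e48a0dd43f251231d518b2e577618e09d06d
size 443311111
"""

info = parse_lfs_pointer(POINTER)
algo, digest = info["oid"].split(":", 1)
print(algo, int(info["size"]))  # sha256 443311111
```

Tools that clone the repo without LFS installed see only this pointer, which is why the diff reports just `+3 -0` lines for a 443 MB file.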
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 22:45:42 0.0000 0.3526 0.1015 0.6836 0.7749 0.7264 0.5890
+ 2 22:47:17 0.0000 0.1124 0.0955 0.7146 0.7760 0.7440 0.6109
+ 3 22:48:52 0.0000 0.0847 0.1280 0.7143 0.7692 0.7407 0.6066
+ 4 22:50:26 0.0000 0.0631 0.1820 0.7131 0.7647 0.7380 0.6063
+ 5 22:52:01 0.0000 0.0470 0.1834 0.7229 0.7760 0.7485 0.6186
+ 6 22:53:35 0.0000 0.0355 0.1920 0.7411 0.7738 0.7571 0.6316
+ 7 22:55:10 0.0000 0.0275 0.2074 0.7697 0.7636 0.7666 0.6380
+ 8 22:56:45 0.0000 0.0203 0.2161 0.7391 0.7692 0.7539 0.6250
+ 9 22:58:19 0.0000 0.0126 0.2227 0.7578 0.7681 0.7629 0.6346
+ 10 22:59:54 0.0000 0.0097 0.2288 0.7497 0.7828 0.7659 0.6401
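The table above is Flair's per-epoch training history: train loss falls monotonically while dev loss bottoms out at epoch 2 and then climbs, so model selection goes by DEV_F1 instead. A minimal stdlib sketch (the `ROWS` literal simply embeds the rows above; the real file is tab-separated, but whitespace splitting handles both) recovering which epoch's snapshot became best-model.pt:

```python
# Rows copied verbatim from the loss.tsv diff above.
ROWS = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 22:45:42 0.0000 0.3526 0.1015 0.6836 0.7749 0.7264 0.5890
2 22:47:17 0.0000 0.1124 0.0955 0.7146 0.7760 0.7440 0.6109
3 22:48:52 0.0000 0.0847 0.1280 0.7143 0.7692 0.7407 0.6066
4 22:50:26 0.0000 0.0631 0.1820 0.7131 0.7647 0.7380 0.6063
5 22:52:01 0.0000 0.0470 0.1834 0.7229 0.7760 0.7485 0.6186
6 22:53:35 0.0000 0.0355 0.1920 0.7411 0.7738 0.7571 0.6316
7 22:55:10 0.0000 0.0275 0.2074 0.7697 0.7636 0.7666 0.6380
8 22:56:45 0.0000 0.0203 0.2161 0.7391 0.7692 0.7539 0.6250
9 22:58:19 0.0000 0.0126 0.2227 0.7578 0.7681 0.7629 0.6346
10 22:59:54 0.0000 0.0097 0.2288 0.7497 0.7828 0.7659 0.6401
"""

# Parse into dicts keyed by the header row, then rank by dev F1.
header, *data = [line.split() for line in ROWS.strip().splitlines()]
rows = [dict(zip(header, vals)) for vals in data]
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # 7 0.7666
```

The widening gap between TRAIN_LOSS and DEV_LOSS after epoch 2 is ordinary overfitting; the best dev F1 (0.7666) nevertheless arrives at epoch 7, which is the checkpoint the training log below reloads for the final test evaluation.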
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,240 @@
+ 2023-10-13 22:44:08,972 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,973 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=13, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,973 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+ 2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,973 Train: 7936 sentences
+ 2023-10-13 22:44:08,973 (train_with_dev=False, train_with_test=False)
+ 2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,973 Training Params:
+ 2023-10-13 22:44:08,973 - learning_rate: "3e-05"
+ 2023-10-13 22:44:08,973 - mini_batch_size: "4"
+ 2023-10-13 22:44:08,973 - max_epochs: "10"
+ 2023-10-13 22:44:08,973 - shuffle: "True"
+ 2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,973 Plugins:
+ 2023-10-13 22:44:08,973 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,974 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-13 22:44:08,974 - metric: "('micro avg', 'f1-score')"
+ 2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,974 Computation:
+ 2023-10-13 22:44:08,974 - compute on device: cuda:0
+ 2023-10-13 22:44:08,974 - embedding storage: none
+ 2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,974 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:44:18,081 epoch 1 - iter 198/1984 - loss 1.81064836 - time (sec): 9.11 - samples/sec: 1829.54 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 22:44:27,066 epoch 1 - iter 396/1984 - loss 1.08951470 - time (sec): 18.09 - samples/sec: 1811.50 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 22:44:36,144 epoch 1 - iter 594/1984 - loss 0.79725090 - time (sec): 27.17 - samples/sec: 1822.29 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-13 22:44:45,070 epoch 1 - iter 792/1984 - loss 0.65426686 - time (sec): 36.10 - samples/sec: 1820.77 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 22:44:54,112 epoch 1 - iter 990/1984 - loss 0.55873840 - time (sec): 45.14 - samples/sec: 1826.09 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 22:45:02,966 epoch 1 - iter 1188/1984 - loss 0.49032838 - time (sec): 53.99 - samples/sec: 1833.75 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 22:45:11,931 epoch 1 - iter 1386/1984 - loss 0.44094947 - time (sec): 62.96 - samples/sec: 1830.34 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 22:45:20,963 epoch 1 - iter 1584/1984 - loss 0.40474175 - time (sec): 71.99 - samples/sec: 1830.66 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-13 22:45:30,256 epoch 1 - iter 1782/1984 - loss 0.37525747 - time (sec): 81.28 - samples/sec: 1818.44 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 22:45:39,648 epoch 1 - iter 1980/1984 - loss 0.35281987 - time (sec): 90.67 - samples/sec: 1805.98 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 22:45:39,839 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:45:39,839 EPOCH 1 done: loss 0.3526 - lr: 0.000030
+ 2023-10-13 22:45:42,914 DEV : loss 0.10146976262331009 - f1-score (micro avg) 0.7264
+ 2023-10-13 22:45:42,937 saving best model
+ 2023-10-13 22:45:43,336 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:45:52,299 epoch 2 - iter 198/1984 - loss 0.13085115 - time (sec): 8.96 - samples/sec: 1965.02 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 22:46:01,550 epoch 2 - iter 396/1984 - loss 0.12180186 - time (sec): 18.21 - samples/sec: 1845.38 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-13 22:46:10,520 epoch 2 - iter 594/1984 - loss 0.11729603 - time (sec): 27.18 - samples/sec: 1843.65 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-13 22:46:19,430 epoch 2 - iter 792/1984 - loss 0.11624537 - time (sec): 36.09 - samples/sec: 1816.81 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-13 22:46:28,458 epoch 2 - iter 990/1984 - loss 0.11635377 - time (sec): 45.12 - samples/sec: 1810.56 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 22:46:37,650 epoch 2 - iter 1188/1984 - loss 0.11405465 - time (sec): 54.31 - samples/sec: 1800.34 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 22:46:46,738 epoch 2 - iter 1386/1984 - loss 0.11342189 - time (sec): 63.40 - samples/sec: 1795.44 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 22:46:55,776 epoch 2 - iter 1584/1984 - loss 0.11424700 - time (sec): 72.44 - samples/sec: 1803.73 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 22:47:04,814 epoch 2 - iter 1782/1984 - loss 0.11408106 - time (sec): 81.48 - samples/sec: 1794.31 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 22:47:13,932 epoch 2 - iter 1980/1984 - loss 0.11249722 - time (sec): 90.59 - samples/sec: 1807.49 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 22:47:14,114 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:47:14,114 EPOCH 2 done: loss 0.1124 - lr: 0.000027
+ 2023-10-13 22:47:17,587 DEV : loss 0.09548649936914444 - f1-score (micro avg) 0.744
+ 2023-10-13 22:47:17,608 saving best model
+ 2023-10-13 22:47:18,148 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:47:27,115 epoch 3 - iter 198/1984 - loss 0.06800668 - time (sec): 8.96 - samples/sec: 1822.91 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 22:47:36,104 epoch 3 - iter 396/1984 - loss 0.07109942 - time (sec): 17.95 - samples/sec: 1798.42 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 22:47:45,084 epoch 3 - iter 594/1984 - loss 0.07668389 - time (sec): 26.93 - samples/sec: 1795.77 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 22:47:54,089 epoch 3 - iter 792/1984 - loss 0.07789366 - time (sec): 35.94 - samples/sec: 1817.49 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 22:48:03,169 epoch 3 - iter 990/1984 - loss 0.07895567 - time (sec): 45.02 - samples/sec: 1825.49 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 22:48:12,181 epoch 3 - iter 1188/1984 - loss 0.08032932 - time (sec): 54.03 - samples/sec: 1818.24 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 22:48:21,733 epoch 3 - iter 1386/1984 - loss 0.08285120 - time (sec): 63.58 - samples/sec: 1802.90 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-13 22:48:30,700 epoch 3 - iter 1584/1984 - loss 0.08388123 - time (sec): 72.55 - samples/sec: 1799.06 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-13 22:48:39,738 epoch 3 - iter 1782/1984 - loss 0.08563997 - time (sec): 81.59 - samples/sec: 1800.26 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-13 22:48:48,772 epoch 3 - iter 1980/1984 - loss 0.08470780 - time (sec): 90.62 - samples/sec: 1805.95 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 22:48:48,950 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:48:48,950 EPOCH 3 done: loss 0.0847 - lr: 0.000023
+ 2023-10-13 22:48:52,399 DEV : loss 0.12796364724636078 - f1-score (micro avg) 0.7407
+ 2023-10-13 22:48:52,419 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:49:01,819 epoch 4 - iter 198/1984 - loss 0.05800278 - time (sec): 9.40 - samples/sec: 1812.92 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 22:49:10,901 epoch 4 - iter 396/1984 - loss 0.05641167 - time (sec): 18.48 - samples/sec: 1793.44 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 22:49:19,796 epoch 4 - iter 594/1984 - loss 0.05743702 - time (sec): 27.38 - samples/sec: 1787.23 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 22:49:28,726 epoch 4 - iter 792/1984 - loss 0.05915402 - time (sec): 36.31 - samples/sec: 1798.85 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 22:49:37,841 epoch 4 - iter 990/1984 - loss 0.05947704 - time (sec): 45.42 - samples/sec: 1795.29 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 22:49:46,872 epoch 4 - iter 1188/1984 - loss 0.06239465 - time (sec): 54.45 - samples/sec: 1801.43 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 22:49:55,856 epoch 4 - iter 1386/1984 - loss 0.06300882 - time (sec): 63.44 - samples/sec: 1807.04 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 22:50:04,872 epoch 4 - iter 1584/1984 - loss 0.06319248 - time (sec): 72.45 - samples/sec: 1804.06 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 22:50:13,909 epoch 4 - iter 1782/1984 - loss 0.06395252 - time (sec): 81.49 - samples/sec: 1810.75 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 22:50:22,853 epoch 4 - iter 1980/1984 - loss 0.06299855 - time (sec): 90.43 - samples/sec: 1811.89 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 22:50:23,030 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:50:23,030 EPOCH 4 done: loss 0.0631 - lr: 0.000020
+ 2023-10-13 22:50:26,447 DEV : loss 0.18204569816589355 - f1-score (micro avg) 0.738
+ 2023-10-13 22:50:26,468 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:50:35,948 epoch 5 - iter 198/1984 - loss 0.04150094 - time (sec): 9.48 - samples/sec: 1789.33 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 22:50:45,154 epoch 5 - iter 396/1984 - loss 0.04233067 - time (sec): 18.69 - samples/sec: 1769.03 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-13 22:50:54,257 epoch 5 - iter 594/1984 - loss 0.04297481 - time (sec): 27.79 - samples/sec: 1815.26 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-13 22:51:03,331 epoch 5 - iter 792/1984 - loss 0.04244711 - time (sec): 36.86 - samples/sec: 1802.29 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-13 22:51:12,472 epoch 5 - iter 990/1984 - loss 0.04326527 - time (sec): 46.00 - samples/sec: 1802.41 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 22:51:21,600 epoch 5 - iter 1188/1984 - loss 0.04560535 - time (sec): 55.13 - samples/sec: 1794.02 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 22:51:30,456 epoch 5 - iter 1386/1984 - loss 0.04527611 - time (sec): 63.99 - samples/sec: 1788.59 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 22:51:39,401 epoch 5 - iter 1584/1984 - loss 0.04572642 - time (sec): 72.93 - samples/sec: 1790.69 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 22:51:48,362 epoch 5 - iter 1782/1984 - loss 0.04653709 - time (sec): 81.89 - samples/sec: 1806.98 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 22:51:57,204 epoch 5 - iter 1980/1984 - loss 0.04705024 - time (sec): 90.73 - samples/sec: 1801.81 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 22:51:57,437 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:51:57,437 EPOCH 5 done: loss 0.0470 - lr: 0.000017
+ 2023-10-13 22:52:01,269 DEV : loss 0.18338747322559357 - f1-score (micro avg) 0.7485
+ 2023-10-13 22:52:01,290 saving best model
+ 2023-10-13 22:52:01,805 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:52:10,662 epoch 6 - iter 198/1984 - loss 0.03423276 - time (sec): 8.85 - samples/sec: 1775.29 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 22:52:19,672 epoch 6 - iter 396/1984 - loss 0.03504659 - time (sec): 17.86 - samples/sec: 1783.45 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 22:52:28,638 epoch 6 - iter 594/1984 - loss 0.03522723 - time (sec): 26.83 - samples/sec: 1801.61 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 22:52:37,746 epoch 6 - iter 792/1984 - loss 0.03512497 - time (sec): 35.93 - samples/sec: 1812.74 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 22:52:46,705 epoch 6 - iter 990/1984 - loss 0.03574142 - time (sec): 44.89 - samples/sec: 1815.89 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 22:52:55,725 epoch 6 - iter 1188/1984 - loss 0.03636170 - time (sec): 53.91 - samples/sec: 1817.02 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 22:53:04,837 epoch 6 - iter 1386/1984 - loss 0.03573438 - time (sec): 63.03 - samples/sec: 1821.47 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-13 22:53:13,757 epoch 6 - iter 1584/1984 - loss 0.03570621 - time (sec): 71.95 - samples/sec: 1821.80 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-13 22:53:23,051 epoch 6 - iter 1782/1984 - loss 0.03561765 - time (sec): 81.24 - samples/sec: 1803.74 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-13 22:53:32,057 epoch 6 - iter 1980/1984 - loss 0.03560613 - time (sec): 90.25 - samples/sec: 1811.62 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 22:53:32,243 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:53:32,243 EPOCH 6 done: loss 0.0355 - lr: 0.000013
+ 2023-10-13 22:53:35,667 DEV : loss 0.19200846552848816 - f1-score (micro avg) 0.7571
+ 2023-10-13 22:53:35,690 saving best model
+ 2023-10-13 22:53:36,227 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:53:45,464 epoch 7 - iter 198/1984 - loss 0.01991158 - time (sec): 9.23 - samples/sec: 1858.36 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 22:53:54,516 epoch 7 - iter 396/1984 - loss 0.02651805 - time (sec): 18.28 - samples/sec: 1844.81 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 22:54:03,473 epoch 7 - iter 594/1984 - loss 0.02637781 - time (sec): 27.24 - samples/sec: 1848.97 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 22:54:12,481 epoch 7 - iter 792/1984 - loss 0.02677852 - time (sec): 36.25 - samples/sec: 1845.67 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 22:54:21,402 epoch 7 - iter 990/1984 - loss 0.02881749 - time (sec): 45.17 - samples/sec: 1842.80 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 22:54:30,402 epoch 7 - iter 1188/1984 - loss 0.02858223 - time (sec): 54.17 - samples/sec: 1853.37 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 22:54:39,379 epoch 7 - iter 1386/1984 - loss 0.02830102 - time (sec): 63.15 - samples/sec: 1833.93 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 22:54:48,384 epoch 7 - iter 1584/1984 - loss 0.02747017 - time (sec): 72.15 - samples/sec: 1827.82 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 22:54:57,360 epoch 7 - iter 1782/1984 - loss 0.02717714 - time (sec): 81.13 - samples/sec: 1826.93 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 22:55:06,509 epoch 7 - iter 1980/1984 - loss 0.02758899 - time (sec): 90.28 - samples/sec: 1812.63 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 22:55:06,690 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:55:06,690 EPOCH 7 done: loss 0.0275 - lr: 0.000010
+ 2023-10-13 22:55:10,051 DEV : loss 0.20744635164737701 - f1-score (micro avg) 0.7666
+ 2023-10-13 22:55:10,071 saving best model
+ 2023-10-13 22:55:10,919 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:55:19,880 epoch 8 - iter 198/1984 - loss 0.02610314 - time (sec): 8.96 - samples/sec: 1850.88 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 22:55:28,864 epoch 8 - iter 396/1984 - loss 0.02132016 - time (sec): 17.94 - samples/sec: 1827.81 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-13 22:55:38,088 epoch 8 - iter 594/1984 - loss 0.02012992 - time (sec): 27.17 - samples/sec: 1814.18 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-13 22:55:47,028 epoch 8 - iter 792/1984 - loss 0.02027899 - time (sec): 36.11 - samples/sec: 1807.99 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-13 22:55:56,139 epoch 8 - iter 990/1984 - loss 0.01932362 - time (sec): 45.22 - samples/sec: 1797.05 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 22:56:05,665 epoch 8 - iter 1188/1984 - loss 0.02059177 - time (sec): 54.74 - samples/sec: 1790.75 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 22:56:14,696 epoch 8 - iter 1386/1984 - loss 0.02038701 - time (sec): 63.78 - samples/sec: 1802.34 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 22:56:23,565 epoch 8 - iter 1584/1984 - loss 0.02069282 - time (sec): 72.64 - samples/sec: 1803.00 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 22:56:32,647 epoch 8 - iter 1782/1984 - loss 0.02056342 - time (sec): 81.73 - samples/sec: 1798.72 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 22:56:41,597 epoch 8 - iter 1980/1984 - loss 0.02020247 - time (sec): 90.68 - samples/sec: 1804.78 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 22:56:41,779 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:56:41,779 EPOCH 8 done: loss 0.0203 - lr: 0.000007
+ 2023-10-13 22:56:45,247 DEV : loss 0.2160894274711609 - f1-score (micro avg) 0.7539
+ 2023-10-13 22:56:45,268 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:56:54,505 epoch 9 - iter 198/1984 - loss 0.00746271 - time (sec): 9.24 - samples/sec: 1666.05 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 22:57:03,468 epoch 9 - iter 396/1984 - loss 0.01246335 - time (sec): 18.20 - samples/sec: 1713.75 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 22:57:12,460 epoch 9 - iter 594/1984 - loss 0.01222490 - time (sec): 27.19 - samples/sec: 1776.85 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 22:57:21,470 epoch 9 - iter 792/1984 - loss 0.01220938 - time (sec): 36.20 - samples/sec: 1795.95 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 22:57:30,847 epoch 9 - iter 990/1984 - loss 0.01368129 - time (sec): 45.58 - samples/sec: 1814.95 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 22:57:39,889 epoch 9 - iter 1188/1984 - loss 0.01266492 - time (sec): 54.62 - samples/sec: 1814.26 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 22:57:49,004 epoch 9 - iter 1386/1984 - loss 0.01257140 - time (sec): 63.74 - samples/sec: 1808.68 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 22:57:58,054 epoch 9 - iter 1584/1984 - loss 0.01226365 - time (sec): 72.79 - samples/sec: 1807.51 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 22:58:06,967 epoch 9 - iter 1782/1984 - loss 0.01268987 - time (sec): 81.70 - samples/sec: 1802.17 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 22:58:15,967 epoch 9 - iter 1980/1984 - loss 0.01248644 - time (sec): 90.70 - samples/sec: 1804.81 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 22:58:16,146 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:58:16,146 EPOCH 9 done: loss 0.0126 - lr: 0.000003
+ 2023-10-13 22:58:19,625 DEV : loss 0.22268585860729218 - f1-score (micro avg) 0.7629
+ 2023-10-13 22:58:19,646 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:58:28,696 epoch 10 - iter 198/1984 - loss 0.01007200 - time (sec): 9.05 - samples/sec: 1701.45 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 22:58:37,663 epoch 10 - iter 396/1984 - loss 0.00885874 - time (sec): 18.02 - samples/sec: 1751.39 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 22:58:46,584 epoch 10 - iter 594/1984 - loss 0.00972978 - time (sec): 26.94 - samples/sec: 1752.48 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 22:58:55,648 epoch 10 - iter 792/1984 - loss 0.00994088 - time (sec): 36.00 - samples/sec: 1761.21 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 22:59:04,705 epoch 10 - iter 990/1984 - loss 0.01002788 - time (sec): 45.06 - samples/sec: 1769.36 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 22:59:13,711 epoch 10 - iter 1188/1984 - loss 0.00996187 - time (sec): 54.06 - samples/sec: 1790.42 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 22:59:22,835 epoch 10 - iter 1386/1984 - loss 0.00984624 - time (sec): 63.19 - samples/sec: 1811.97 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 22:59:31,920 epoch 10 - iter 1584/1984 - loss 0.00915830 - time (sec): 72.27 - samples/sec: 1815.67 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 22:59:41,048 epoch 10 - iter 1782/1984 - loss 0.00916070 - time (sec): 81.40 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-13 22:59:50,115 epoch 10 - iter 1980/1984 - loss 0.00968803 - time (sec): 90.47 - samples/sec: 1809.43 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-13 22:59:50,294 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:59:50,294 EPOCH 10 done: loss 0.0097 - lr: 0.000000
+ 2023-10-13 22:59:54,151 DEV : loss 0.2287728190422058 - f1-score (micro avg) 0.7659
+ 2023-10-13 22:59:54,607 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 22:59:54,608 Loading model from best epoch ...
+ 2023-10-13 22:59:56,028 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-13 22:59:59,358
+ Results:
+ - F-score (micro) 0.7872
+ - F-score (macro) 0.6794
+ - Accuracy 0.6667
+
+ By class:
+ precision recall f1-score support
+
+ LOC 0.8251 0.8718 0.8478 655
+ PER 0.7406 0.7937 0.7662 223
+ ORG 0.5915 0.3307 0.4242 127
+
+ micro avg 0.7884 0.7861 0.7872 1005
+ macro avg 0.7191 0.6654 0.6794 1005
+ weighted avg 0.7769 0.7861 0.7762 1005
+
+ 2023-10-13 22:59:59,358 ----------------------------------------------------------------------------------------------------
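In the final report above, the micro average weighs every entity equally while the macro average weighs every class equally, so the weak ORG class (F1 0.4242 on only 127 entities) drags the macro figure well below the micro one. As a sanity check, both averages can be reconstructed from the per-class rows alone; a sketch, assuming true-positive counts are recovered by rounding recall × support:

```python
# (precision, recall, f1, support) copied from the "By class" table above.
per_class = {
    "LOC": (0.8251, 0.8718, 0.8478, 655),
    "PER": (0.7406, 0.7937, 0.7662, 223),
    "ORG": (0.5915, 0.3307, 0.4242, 127),
}

# Reconstruct true-positive and predicted-entity counts per class.
tp = {c: round(r * s) for c, (p, r, f, s) in per_class.items()}
pred = {c: round(tp[c] / p) for c, (p, r, f, s) in per_class.items()}

# Micro: pool counts over all classes, then compute P/R/F1 once.
micro_p = sum(tp.values()) / sum(pred.values())
micro_r = sum(tp.values()) / sum(s for *_, s in per_class.values())
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

print(round(micro_f1, 4), round(macro_f1, 4))  # 0.7872 0.6794
```

The pooled counts come out to 790 true positives over 1002 predicted and 1005 gold entities, matching the logged micro precision 0.7884 and recall 0.7861, so the table is internally consistent.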