stefan-it committed
Commit
befa204
1 Parent(s): c887fd5

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +244 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ae720892022cb422e77268145752600df960d4e48309b10453785e508757a790
+ size 443335879
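Because `best-model.pt` is tracked with Git LFS, the diff above contains only a three-line pointer file (version, content hash, byte size), not the 443 MB checkpoint itself. A minimal sketch of reading such a pointer; the `parse_lfs_pointer` helper is hypothetical, not part of any LFS tooling:

```python
# Parse a Git LFS pointer file into its key/value fields.
# The pointer format is a handful of "key value" lines, as in the diff above.

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ae720892022cb422e77268145752600df960d4e48309b10453785e508757a790
size 443335879
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

info = parse_lfs_pointer(POINTER)
algo, _, digest = info["oid"].partition(":")
print(algo, len(digest), int(info["size"]))  # the digest is a 64-char sha256 hex string
```

The `oid` is what LFS uses to fetch the real object from the storage backend, so the same parse could be used to verify a downloaded checkpoint against its sha256.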
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+ 1	17:47:59	0.0000	0.5507	0.1279	0.6947	0.7325	0.7131	0.5859
+ 2	17:49:01	0.0000	0.1189	0.1320	0.6847	0.7835	0.7308	0.6129
+ 3	17:50:03	0.0000	0.0725	0.1486	0.7662	0.8013	0.7833	0.6749
+ 4	17:51:04	0.0000	0.0533	0.1738	0.7839	0.8270	0.8049	0.6942
+ 5	17:52:05	0.0000	0.0360	0.1813	0.7989	0.8373	0.8177	0.7139
+ 6	17:53:07	0.0000	0.0228	0.2183	0.7811	0.8173	0.7988	0.6944
+ 7	17:54:08	0.0000	0.0186	0.2016	0.8218	0.8293	0.8255	0.7258
+ 8	17:55:10	0.0000	0.0120	0.2121	0.7986	0.8356	0.8167	0.7177
+ 9	17:56:10	0.0000	0.0075	0.2237	0.8136	0.8351	0.8242	0.7254
+ 10	17:57:12	0.0000	0.0049	0.2252	0.8167	0.8368	0.8266	0.7283
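The training log below shows that `best-model.pt` is re-saved whenever the dev micro-F1 improves, so the checkpoint corresponds to the argmax over the `DEV_F1` column above. A minimal sketch of that selection, with the per-epoch values copied from `loss.tsv`:

```python
# Dev micro-F1 per epoch, copied from the loss.tsv diff above.
dev_f1 = {1: 0.7131, 2: 0.7308, 3: 0.7833, 4: 0.8049, 5: 0.8177,
          6: 0.7988, 7: 0.8255, 8: 0.8167, 9: 0.8242, 10: 0.8266}

# The best model is the checkpoint from the epoch with the highest dev F1;
# note the curve is not monotone (epochs 6 and 8 regress), so argmax matters.
best_epoch = max(dev_f1, key=dev_f1.get)
print(best_epoch, dev_f1[best_epoch])  # epoch 10 is best in this run
```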
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,244 @@
+ 2023-10-13 17:47:04,163 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=21, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+  - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Train: 5901 sentences
+ 2023-10-13 17:47:04,164 (train_with_dev=False, train_with_test=False)
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Training Params:
+ 2023-10-13 17:47:04,164 - learning_rate: "5e-05"
+ 2023-10-13 17:47:04,164 - mini_batch_size: "8"
+ 2023-10-13 17:47:04,164 - max_epochs: "10"
+ 2023-10-13 17:47:04,164 - shuffle: "True"
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Plugins:
+ 2023-10-13 17:47:04,164 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-13 17:47:04,164 - metric: "('micro avg', 'f1-score')"
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Computation:
+ 2023-10-13 17:47:04,164 - compute on device: cuda:0
+ 2023-10-13 17:47:04,164 - embedding storage: none
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,164 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
+ 2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:04,165 ----------------------------------------------------------------------------------------------------
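The `lr` values in the iteration lines below follow the `LinearScheduler | warmup_fraction: '0.1'` plugin listed above: the rate climbs linearly to the peak of 5e-05 over the first 10% of steps, then decays linearly to zero. A sketch under the assumption of 7380 total steps (10 epochs × 738 iterations); the function name is illustrative, not a Flair API:

```python
def linear_schedule_lr(step, total_steps=7380, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 738 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Compare with the rounded values in the log:
print(linear_schedule_lr(73))    # log shows lr 0.000005 at epoch 1, iter 73
print(linear_schedule_lr(730))   # log shows lr 0.000049 at the end of epoch 1
print(linear_schedule_lr(7372))  # log shows lr 0.000000 near the end of epoch 10
```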
+ 2023-10-13 17:47:09,303 epoch 1 - iter 73/738 - loss 2.61704957 - time (sec): 5.14 - samples/sec: 3329.07 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 17:47:14,890 epoch 1 - iter 146/738 - loss 1.63900339 - time (sec): 10.72 - samples/sec: 3355.14 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 17:47:19,561 epoch 1 - iter 219/738 - loss 1.25224248 - time (sec): 15.40 - samples/sec: 3383.08 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 17:47:24,159 epoch 1 - iter 292/738 - loss 1.03925094 - time (sec): 19.99 - samples/sec: 3392.19 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 17:47:28,728 epoch 1 - iter 365/738 - loss 0.89541308 - time (sec): 24.56 - samples/sec: 3399.05 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 17:47:33,630 epoch 1 - iter 438/738 - loss 0.79056877 - time (sec): 29.46 - samples/sec: 3393.34 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 17:47:37,911 epoch 1 - iter 511/738 - loss 0.71979193 - time (sec): 33.75 - samples/sec: 3392.11 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-13 17:47:42,768 epoch 1 - iter 584/738 - loss 0.65785367 - time (sec): 38.60 - samples/sec: 3381.31 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-13 17:47:47,675 epoch 1 - iter 657/738 - loss 0.60430136 - time (sec): 43.51 - samples/sec: 3373.75 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-13 17:47:53,061 epoch 1 - iter 730/738 - loss 0.55420001 - time (sec): 48.90 - samples/sec: 3371.78 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-13 17:47:53,539 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:47:53,540 EPOCH 1 done: loss 0.5507 - lr: 0.000049
+ 2023-10-13 17:47:59,706 DEV : loss 0.12785974144935608 - f1-score (micro avg)  0.7131
+ 2023-10-13 17:47:59,734 saving best model
+ 2023-10-13 17:48:00,205 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:48:04,704 epoch 2 - iter 73/738 - loss 0.14031504 - time (sec): 4.50 - samples/sec: 3262.14 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-13 17:48:09,192 epoch 2 - iter 146/738 - loss 0.13796547 - time (sec): 8.99 - samples/sec: 3313.17 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-13 17:48:14,464 epoch 2 - iter 219/738 - loss 0.13466397 - time (sec): 14.26 - samples/sec: 3360.71 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-13 17:48:19,296 epoch 2 - iter 292/738 - loss 0.13225824 - time (sec): 19.09 - samples/sec: 3355.88 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-13 17:48:24,137 epoch 2 - iter 365/738 - loss 0.13022111 - time (sec): 23.93 - samples/sec: 3350.86 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-13 17:48:29,160 epoch 2 - iter 438/738 - loss 0.12720644 - time (sec): 28.95 - samples/sec: 3364.48 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-13 17:48:34,186 epoch 2 - iter 511/738 - loss 0.12457501 - time (sec): 33.98 - samples/sec: 3340.81 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-13 17:48:38,951 epoch 2 - iter 584/738 - loss 0.12421710 - time (sec): 38.74 - samples/sec: 3349.70 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-13 17:48:44,401 epoch 2 - iter 657/738 - loss 0.12162126 - time (sec): 44.19 - samples/sec: 3350.95 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-13 17:48:49,387 epoch 2 - iter 730/738 - loss 0.11931305 - time (sec): 49.18 - samples/sec: 3349.23 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-13 17:48:49,874 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:48:49,874 EPOCH 2 done: loss 0.1189 - lr: 0.000045
+ 2023-10-13 17:49:01,090 DEV : loss 0.13197503983974457 - f1-score (micro avg)  0.7308
+ 2023-10-13 17:49:01,119 saving best model
+ 2023-10-13 17:49:01,598 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:49:06,572 epoch 3 - iter 73/738 - loss 0.07614814 - time (sec): 4.97 - samples/sec: 3274.66 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-13 17:49:11,815 epoch 3 - iter 146/738 - loss 0.07670962 - time (sec): 10.21 - samples/sec: 3311.41 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-13 17:49:16,339 epoch 3 - iter 219/738 - loss 0.07615619 - time (sec): 14.74 - samples/sec: 3339.95 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-13 17:49:21,598 epoch 3 - iter 292/738 - loss 0.08533494 - time (sec): 20.00 - samples/sec: 3355.66 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-13 17:49:26,329 epoch 3 - iter 365/738 - loss 0.08052962 - time (sec): 24.73 - samples/sec: 3350.28 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-13 17:49:31,257 epoch 3 - iter 438/738 - loss 0.07707434 - time (sec): 29.65 - samples/sec: 3330.19 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-13 17:49:36,060 epoch 3 - iter 511/738 - loss 0.07639684 - time (sec): 34.46 - samples/sec: 3344.43 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-13 17:49:41,408 epoch 3 - iter 584/738 - loss 0.07393004 - time (sec): 39.80 - samples/sec: 3333.01 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-13 17:49:46,311 epoch 3 - iter 657/738 - loss 0.07260392 - time (sec): 44.71 - samples/sec: 3315.95 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-13 17:49:51,524 epoch 3 - iter 730/738 - loss 0.07247522 - time (sec): 49.92 - samples/sec: 3305.85 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-13 17:49:51,967 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:49:51,967 EPOCH 3 done: loss 0.0725 - lr: 0.000039
+ 2023-10-13 17:50:03,343 DEV : loss 0.1486140638589859 - f1-score (micro avg)  0.7833
+ 2023-10-13 17:50:03,372 saving best model
+ 2023-10-13 17:50:03,852 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:50:09,135 epoch 4 - iter 73/738 - loss 0.05135216 - time (sec): 5.28 - samples/sec: 3383.66 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-13 17:50:13,787 epoch 4 - iter 146/738 - loss 0.05140785 - time (sec): 9.93 - samples/sec: 3335.68 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-13 17:50:19,538 epoch 4 - iter 219/738 - loss 0.04837867 - time (sec): 15.68 - samples/sec: 3374.63 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-13 17:50:24,656 epoch 4 - iter 292/738 - loss 0.05298887 - time (sec): 20.80 - samples/sec: 3349.51 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-13 17:50:29,288 epoch 4 - iter 365/738 - loss 0.05192526 - time (sec): 25.43 - samples/sec: 3358.42 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-13 17:50:34,582 epoch 4 - iter 438/738 - loss 0.05297484 - time (sec): 30.72 - samples/sec: 3368.28 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-13 17:50:39,204 epoch 4 - iter 511/738 - loss 0.05272183 - time (sec): 35.34 - samples/sec: 3368.87 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-13 17:50:43,882 epoch 4 - iter 584/738 - loss 0.05406746 - time (sec): 40.02 - samples/sec: 3349.33 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-13 17:50:48,214 epoch 4 - iter 657/738 - loss 0.05441271 - time (sec): 44.35 - samples/sec: 3352.26 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-13 17:50:52,924 epoch 4 - iter 730/738 - loss 0.05358667 - time (sec): 49.06 - samples/sec: 3359.88 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-13 17:50:53,379 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:50:53,379 EPOCH 4 done: loss 0.0533 - lr: 0.000033
+ 2023-10-13 17:51:04,589 DEV : loss 0.1737738847732544 - f1-score (micro avg)  0.8049
+ 2023-10-13 17:51:04,619 saving best model
+ 2023-10-13 17:51:05,140 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:51:10,235 epoch 5 - iter 73/738 - loss 0.03032376 - time (sec): 5.09 - samples/sec: 3312.13 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-13 17:51:14,799 epoch 5 - iter 146/738 - loss 0.03581647 - time (sec): 9.65 - samples/sec: 3328.98 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-13 17:51:19,339 epoch 5 - iter 219/738 - loss 0.03520240 - time (sec): 14.19 - samples/sec: 3390.73 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-13 17:51:24,352 epoch 5 - iter 292/738 - loss 0.03677044 - time (sec): 19.21 - samples/sec: 3407.81 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-13 17:51:29,378 epoch 5 - iter 365/738 - loss 0.03507287 - time (sec): 24.23 - samples/sec: 3366.45 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-13 17:51:34,241 epoch 5 - iter 438/738 - loss 0.03455833 - time (sec): 29.10 - samples/sec: 3355.44 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 17:51:39,901 epoch 5 - iter 511/738 - loss 0.03549297 - time (sec): 34.76 - samples/sec: 3357.51 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 17:51:44,117 epoch 5 - iter 584/738 - loss 0.03649335 - time (sec): 38.97 - samples/sec: 3377.19 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-13 17:51:49,172 epoch 5 - iter 657/738 - loss 0.03572356 - time (sec): 44.03 - samples/sec: 3376.14 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 17:51:53,851 epoch 5 - iter 730/738 - loss 0.03593262 - time (sec): 48.71 - samples/sec: 3383.72 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 17:51:54,287 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:51:54,287 EPOCH 5 done: loss 0.0360 - lr: 0.000028
+ 2023-10-13 17:52:05,499 DEV : loss 0.1812753677368164 - f1-score (micro avg)  0.8177
+ 2023-10-13 17:52:05,531 saving best model
+ 2023-10-13 17:52:06,113 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:52:11,732 epoch 6 - iter 73/738 - loss 0.01621660 - time (sec): 5.61 - samples/sec: 3000.04 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 17:52:16,775 epoch 6 - iter 146/738 - loss 0.02080977 - time (sec): 10.66 - samples/sec: 3100.09 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 17:52:21,263 epoch 6 - iter 219/738 - loss 0.01876135 - time (sec): 15.14 - samples/sec: 3144.78 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 17:52:25,848 epoch 6 - iter 292/738 - loss 0.02138464 - time (sec): 19.73 - samples/sec: 3183.99 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 17:52:31,015 epoch 6 - iter 365/738 - loss 0.01978816 - time (sec): 24.90 - samples/sec: 3209.75 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 17:52:35,258 epoch 6 - iter 438/738 - loss 0.01938038 - time (sec): 29.14 - samples/sec: 3225.51 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 17:52:40,274 epoch 6 - iter 511/738 - loss 0.01928215 - time (sec): 34.16 - samples/sec: 3254.34 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-13 17:52:45,579 epoch 6 - iter 584/738 - loss 0.01987479 - time (sec): 39.46 - samples/sec: 3280.26 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 17:52:51,330 epoch 6 - iter 657/738 - loss 0.02178292 - time (sec): 45.21 - samples/sec: 3297.22 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 17:52:56,145 epoch 6 - iter 730/738 - loss 0.02282192 - time (sec): 50.03 - samples/sec: 3300.32 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 17:52:56,552 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:52:56,552 EPOCH 6 done: loss 0.0228 - lr: 0.000022
+ 2023-10-13 17:53:07,779 DEV : loss 0.21827659010887146 - f1-score (micro avg)  0.7988
+ 2023-10-13 17:53:07,809 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:53:12,402 epoch 7 - iter 73/738 - loss 0.01465345 - time (sec): 4.59 - samples/sec: 3353.90 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 17:53:16,861 epoch 7 - iter 146/738 - loss 0.01573536 - time (sec): 9.05 - samples/sec: 3298.73 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 17:53:21,855 epoch 7 - iter 219/738 - loss 0.01844721 - time (sec): 14.04 - samples/sec: 3360.00 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 17:53:26,583 epoch 7 - iter 292/738 - loss 0.01755483 - time (sec): 18.77 - samples/sec: 3348.38 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 17:53:31,556 epoch 7 - iter 365/738 - loss 0.01732233 - time (sec): 23.75 - samples/sec: 3348.92 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 17:53:36,455 epoch 7 - iter 438/738 - loss 0.01872450 - time (sec): 28.64 - samples/sec: 3348.83 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-13 17:53:41,246 epoch 7 - iter 511/738 - loss 0.01801696 - time (sec): 33.44 - samples/sec: 3353.96 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 17:53:46,182 epoch 7 - iter 584/738 - loss 0.01924045 - time (sec): 38.37 - samples/sec: 3350.46 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 17:53:51,818 epoch 7 - iter 657/738 - loss 0.01897246 - time (sec): 44.01 - samples/sec: 3363.15 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 17:53:56,775 epoch 7 - iter 730/738 - loss 0.01878918 - time (sec): 48.96 - samples/sec: 3357.79 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 17:53:57,373 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:53:57,373 EPOCH 7 done: loss 0.0186 - lr: 0.000017
+ 2023-10-13 17:54:08,578 DEV : loss 0.20159663259983063 - f1-score (micro avg)  0.8255
+ 2023-10-13 17:54:08,607 saving best model
+ 2023-10-13 17:54:09,182 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:54:14,384 epoch 8 - iter 73/738 - loss 0.00867283 - time (sec): 5.20 - samples/sec: 3376.39 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 17:54:18,924 epoch 8 - iter 146/738 - loss 0.00929264 - time (sec): 9.74 - samples/sec: 3338.85 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 17:54:23,900 epoch 8 - iter 219/738 - loss 0.00988928 - time (sec): 14.71 - samples/sec: 3360.24 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 17:54:28,467 epoch 8 - iter 292/738 - loss 0.01078611 - time (sec): 19.28 - samples/sec: 3357.48 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 17:54:33,531 epoch 8 - iter 365/738 - loss 0.01188262 - time (sec): 24.34 - samples/sec: 3332.34 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-13 17:54:39,068 epoch 8 - iter 438/738 - loss 0.01168210 - time (sec): 29.88 - samples/sec: 3313.89 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 17:54:43,328 epoch 8 - iter 511/738 - loss 0.01108505 - time (sec): 34.14 - samples/sec: 3334.40 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 17:54:48,539 epoch 8 - iter 584/738 - loss 0.01139473 - time (sec): 39.35 - samples/sec: 3327.81 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 17:54:53,175 epoch 8 - iter 657/738 - loss 0.01059925 - time (sec): 43.99 - samples/sec: 3333.14 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 17:54:58,385 epoch 8 - iter 730/738 - loss 0.01213062 - time (sec): 49.20 - samples/sec: 3351.79 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 17:54:58,847 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:54:58,847 EPOCH 8 done: loss 0.0120 - lr: 0.000011
+ 2023-10-13 17:55:10,116 DEV : loss 0.2121274471282959 - f1-score (micro avg)  0.8167
+ 2023-10-13 17:55:10,146 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:55:15,100 epoch 9 - iter 73/738 - loss 0.00785780 - time (sec): 4.95 - samples/sec: 3384.77 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 17:55:20,217 epoch 9 - iter 146/738 - loss 0.00968859 - time (sec): 10.07 - samples/sec: 3323.66 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 17:55:24,538 epoch 9 - iter 219/738 - loss 0.00766936 - time (sec): 14.39 - samples/sec: 3355.87 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 17:55:29,150 epoch 9 - iter 292/738 - loss 0.00776847 - time (sec): 19.00 - samples/sec: 3343.63 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-13 17:55:34,169 epoch 9 - iter 365/738 - loss 0.00786568 - time (sec): 24.02 - samples/sec: 3303.61 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 17:55:39,513 epoch 9 - iter 438/738 - loss 0.00805319 - time (sec): 29.37 - samples/sec: 3304.31 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 17:55:44,830 epoch 9 - iter 511/738 - loss 0.00739363 - time (sec): 34.68 - samples/sec: 3308.13 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 17:55:49,338 epoch 9 - iter 584/738 - loss 0.00731685 - time (sec): 39.19 - samples/sec: 3324.15 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 17:55:54,064 epoch 9 - iter 657/738 - loss 0.00765869 - time (sec): 43.92 - samples/sec: 3324.65 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 17:55:59,126 epoch 9 - iter 730/738 - loss 0.00759718 - time (sec): 48.98 - samples/sec: 3359.39 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 17:55:59,614 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:55:59,614 EPOCH 9 done: loss 0.0075 - lr: 0.000006
+ 2023-10-13 17:56:10,875 DEV : loss 0.22374621033668518 - f1-score (micro avg)  0.8242
+ 2023-10-13 17:56:10,904 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:56:16,195 epoch 10 - iter 73/738 - loss 0.00432633 - time (sec): 5.29 - samples/sec: 3017.62 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 17:56:21,075 epoch 10 - iter 146/738 - loss 0.00341698 - time (sec): 10.17 - samples/sec: 3203.85 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 17:56:25,457 epoch 10 - iter 219/738 - loss 0.00467938 - time (sec): 14.55 - samples/sec: 3261.52 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 17:56:30,710 epoch 10 - iter 292/738 - loss 0.00480875 - time (sec): 19.81 - samples/sec: 3313.14 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 17:56:36,255 epoch 10 - iter 365/738 - loss 0.00575497 - time (sec): 25.35 - samples/sec: 3311.60 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 17:56:40,976 epoch 10 - iter 438/738 - loss 0.00574872 - time (sec): 30.07 - samples/sec: 3312.48 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 17:56:45,946 epoch 10 - iter 511/738 - loss 0.00539595 - time (sec): 35.04 - samples/sec: 3329.58 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 17:56:51,370 epoch 10 - iter 584/738 - loss 0.00518260 - time (sec): 40.47 - samples/sec: 3321.26 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 17:56:56,106 epoch 10 - iter 657/738 - loss 0.00512442 - time (sec): 45.20 - samples/sec: 3321.97 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 17:57:00,525 epoch 10 - iter 730/738 - loss 0.00499521 - time (sec): 49.62 - samples/sec: 3319.82 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-13 17:57:00,998 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:57:00,999 EPOCH 10 done: loss 0.0049 - lr: 0.000000
+ 2023-10-13 17:57:12,280 DEV : loss 0.22519326210021973 - f1-score (micro avg)  0.8266
+ 2023-10-13 17:57:12,310 saving best model
+ 2023-10-13 17:57:13,140 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 17:57:13,141 Loading model from best epoch ...
+ 2023-10-13 17:57:14,542 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
+ 2023-10-13 17:57:20,591
+ Results:
+ - F-score (micro) 0.8013
+ - F-score (macro) 0.7071
+ - Accuracy 0.6949
+
+ By class:
+                precision    recall  f1-score   support
+
+          loc      0.8622    0.8823    0.8721       858
+         pers      0.7549    0.7970    0.7754       537
+          org      0.5652    0.5909    0.5778       132
+         time      0.5484    0.6296    0.5862        54
+         prod      0.7636    0.6885    0.7241        61
+
+    micro avg      0.7876    0.8155    0.8013      1642
+    macro avg      0.6989    0.7177    0.7071      1642
+ weighted avg      0.7892    0.8155    0.8019      1642
+
+ 2023-10-13 17:57:20,591 ----------------------------------------------------------------------------------------------------
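The micro and macro averages in the final report follow from the per-class rows: micro averaging pools true/false positives across all classes (so frequent classes like `loc` dominate), while macro averaging is the unweighted mean of per-class F1. A sketch that approximately reconstructs both from the precision/recall/support columns above, recovering TP ≈ recall × support and FP ≈ TP × (1 − precision) / precision:

```python
# Per-class (precision, recall, support) from the final test report above.
classes = {
    "loc":  (0.8622, 0.8823, 858),
    "pers": (0.7549, 0.7970, 537),
    "org":  (0.5652, 0.5909, 132),
    "time": (0.5484, 0.6296, 54),
    "prod": (0.7636, 0.6885, 61),
}

# Recover approximate TP/FP counts per class and pool them for the micro average.
tp = sum(r * s for p, r, s in classes.values())
fp = sum(r * s * (1 - p) / p for p, r, s in classes.values())
support = sum(s for _, _, s in classes.values())  # 1642 gold spans

micro_p = tp / (tp + fp)
micro_r = tp / support
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(2 * p * r / (p + r) for p, r, _ in classes.values()) / len(classes)

print(round(micro_f1, 4), round(macro_f1, 4))  # close to the reported 0.8013 and 0.7071
```

The ~0.09 gap between the two averages reflects how much weaker the model is on the low-support classes (`org`, `time`) than on `loc` and `pers`.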