stefan-it committed on
Commit
4f66084
1 Parent(s): 94c8c3c

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +243 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1c5ffe75333cc4a60c5cc9da29d6149c8fd948969dc1576258b1c3f1e4606f39
+ size 443311111
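The checkpoint above is stored as a Git LFS pointer, not the weights themselves: the repo tracks only the hash and byte size, and the 443 MB blob lives in LFS storage. A minimal sketch (plain Python, no LFS tooling assumed) of reading such a pointer:

```python
# Parse a Git LFS pointer file into its key/value fields.
# The pointer text is copied verbatim from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1c5ffe75333cc4a60c5cc9da29d6149c8fd948969dc1576258b1c3f1e4606f39
size 443311111
"""

# Each line is "key value"; split on the first space only.
fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
algo, digest = fields["oid"].split(":", 1)

print(algo)                 # sha256
print(int(fields["size"]))  # 443311111 (bytes, ~443 MB)
```

`git lfs` resolves this pointer to the real `best-model.pt` on checkout; the hash lets the client verify the downloaded blob.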
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+ 1	09:55:49	0.0000	0.2833	0.1179	0.8053	0.6963	0.7468	0.6150
+ 2	09:57:07	0.0000	0.1077	0.1063	0.7057	0.7779	0.7400	0.6063
+ 3	09:58:24	0.0000	0.0739	0.1096	0.8357	0.7200	0.7736	0.6484
+ 4	09:59:43	0.0000	0.0579	0.1358	0.7944	0.7624	0.7781	0.6485
+ 5	10:01:02	0.0000	0.0432	0.1354	0.8430	0.7655	0.8024	0.6848
+ 6	10:02:20	0.0000	0.0337	0.1495	0.8247	0.7872	0.8055	0.6902
+ 7	10:03:38	0.0000	0.0218	0.1777	0.8522	0.7862	0.8178	0.7033
+ 8	10:04:56	0.0000	0.0154	0.1747	0.8709	0.7738	0.8195	0.7073
+ 9	10:06:13	0.0000	0.0098	0.1775	0.8786	0.7624	0.8164	0.7002
+ 10	10:07:31	0.0000	0.0068	0.1839	0.8766	0.7779	0.8243	0.7158
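loss.tsv records one row of dev metrics per epoch. A minimal sketch (stdlib only, numbers copied from the file above; the selection logic illustrates the idea, it is not Flair's actual implementation) of picking the epoch with the highest dev micro-F1, which matches the checkpoint the trainer kept as best-model.pt:

```python
# Per-epoch dev metrics copied from loss.tsv above
# (columns shown whitespace-separated here for brevity).
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 09:55:49 0.0000 0.2833 0.1179 0.8053 0.6963 0.7468 0.6150
2 09:57:07 0.0000 0.1077 0.1063 0.7057 0.7779 0.7400 0.6063
3 09:58:24 0.0000 0.0739 0.1096 0.8357 0.7200 0.7736 0.6484
4 09:59:43 0.0000 0.0579 0.1358 0.7944 0.7624 0.7781 0.6485
5 10:01:02 0.0000 0.0432 0.1354 0.8430 0.7655 0.8024 0.6848
6 10:02:20 0.0000 0.0337 0.1495 0.8247 0.7872 0.8055 0.6902
7 10:03:38 0.0000 0.0218 0.1777 0.8522 0.7862 0.8178 0.7033
8 10:04:56 0.0000 0.0154 0.1747 0.8709 0.7738 0.8195 0.7073
9 10:06:13 0.0000 0.0098 0.1775 0.8786 0.7624 0.8164 0.7002
10 10:07:31 0.0000 0.0068 0.1839 0.8766 0.7779 0.8243 0.7158
"""

header, *rows = LOSS_TSV.strip().splitlines()
cols = header.split()
epochs = [dict(zip(cols, row.split())) for row in rows]

# Keep the epoch with the highest dev micro-F1.
best = max(epochs, key=lambda e: float(e["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # epoch 10, F1 0.8243
```

Note that dev loss keeps rising after epoch 2 while dev F1 keeps improving; selecting on the F1 metric (as configured in the training log below) rather than loss is what makes epoch 10 the best checkpoint.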
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,243 @@
+ 2023-10-14 09:54:33,483 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,484 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,484 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+ 2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,484 Train: 5777 sentences
+ 2023-10-14 09:54:33,484 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,485 Training Params:
+ 2023-10-14 09:54:33,485 - learning_rate: "5e-05"
+ 2023-10-14 09:54:33,485 - mini_batch_size: "4"
+ 2023-10-14 09:54:33,485 - max_epochs: "10"
+ 2023-10-14 09:54:33,485 - shuffle: "True"
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,485 Plugins:
+ 2023-10-14 09:54:33,485 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,485 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 09:54:33,485 - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,485 Computation:
+ 2023-10-14 09:54:33,485 - compute on device: cuda:0
+ 2023-10-14 09:54:33,485 - embedding storage: none
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,485 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:54:40,631 epoch 1 - iter 144/1445 - loss 1.32151919 - time (sec): 7.14 - samples/sec: 2426.87 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 09:54:47,906 epoch 1 - iter 288/1445 - loss 0.79135229 - time (sec): 14.42 - samples/sec: 2414.73 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 09:54:54,888 epoch 1 - iter 432/1445 - loss 0.58960751 - time (sec): 21.40 - samples/sec: 2424.36 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 09:55:02,060 epoch 1 - iter 576/1445 - loss 0.49079879 - time (sec): 28.57 - samples/sec: 2430.02 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 09:55:09,558 epoch 1 - iter 720/1445 - loss 0.41600711 - time (sec): 36.07 - samples/sec: 2465.79 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 09:55:16,802 epoch 1 - iter 864/1445 - loss 0.37774428 - time (sec): 43.32 - samples/sec: 2443.10 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 09:55:23,834 epoch 1 - iter 1008/1445 - loss 0.34717009 - time (sec): 50.35 - samples/sec: 2433.37 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-14 09:55:31,157 epoch 1 - iter 1152/1445 - loss 0.32075520 - time (sec): 57.67 - samples/sec: 2436.34 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-14 09:55:38,446 epoch 1 - iter 1296/1445 - loss 0.29995277 - time (sec): 64.96 - samples/sec: 2440.65 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-14 09:55:45,688 epoch 1 - iter 1440/1445 - loss 0.28332814 - time (sec): 72.20 - samples/sec: 2436.27 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-14 09:55:45,899 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:55:45,900 EPOCH 1 done: loss 0.2833 - lr: 0.000050
+ 2023-10-14 09:55:49,489 DEV : loss 0.11788433790206909 - f1-score (micro avg) 0.7468
+ 2023-10-14 09:55:49,514 saving best model
+ 2023-10-14 09:55:49,872 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:55:57,243 epoch 2 - iter 144/1445 - loss 0.13020671 - time (sec): 7.37 - samples/sec: 2424.95 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-14 09:56:04,400 epoch 2 - iter 288/1445 - loss 0.12131136 - time (sec): 14.53 - samples/sec: 2414.49 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-14 09:56:11,439 epoch 2 - iter 432/1445 - loss 0.11672384 - time (sec): 21.56 - samples/sec: 2425.04 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-14 09:56:18,951 epoch 2 - iter 576/1445 - loss 0.11924226 - time (sec): 29.08 - samples/sec: 2436.14 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-14 09:56:26,351 epoch 2 - iter 720/1445 - loss 0.11484030 - time (sec): 36.48 - samples/sec: 2450.06 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-14 09:56:33,706 epoch 2 - iter 864/1445 - loss 0.10994458 - time (sec): 43.83 - samples/sec: 2433.06 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-14 09:56:40,743 epoch 2 - iter 1008/1445 - loss 0.10976559 - time (sec): 50.87 - samples/sec: 2422.72 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-14 09:56:48,032 epoch 2 - iter 1152/1445 - loss 0.10785653 - time (sec): 58.16 - samples/sec: 2425.32 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-14 09:56:55,326 epoch 2 - iter 1296/1445 - loss 0.10867392 - time (sec): 65.45 - samples/sec: 2423.28 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-14 09:57:02,445 epoch 2 - iter 1440/1445 - loss 0.10774871 - time (sec): 72.57 - samples/sec: 2421.80 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-14 09:57:02,665 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:57:02,666 EPOCH 2 done: loss 0.1077 - lr: 0.000044
+ 2023-10-14 09:57:07,129 DEV : loss 0.10633940249681473 - f1-score (micro avg) 0.74
+ 2023-10-14 09:57:07,146 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:57:15,050 epoch 3 - iter 144/1445 - loss 0.09035149 - time (sec): 7.90 - samples/sec: 2218.66 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-14 09:57:22,733 epoch 3 - iter 288/1445 - loss 0.08638552 - time (sec): 15.59 - samples/sec: 2237.53 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-14 09:57:30,022 epoch 3 - iter 432/1445 - loss 0.07945969 - time (sec): 22.87 - samples/sec: 2313.62 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-14 09:57:37,519 epoch 3 - iter 576/1445 - loss 0.07900069 - time (sec): 30.37 - samples/sec: 2328.85 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-14 09:57:44,928 epoch 3 - iter 720/1445 - loss 0.07585140 - time (sec): 37.78 - samples/sec: 2331.64 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-14 09:57:52,073 epoch 3 - iter 864/1445 - loss 0.07259362 - time (sec): 44.93 - samples/sec: 2350.70 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-14 09:57:59,957 epoch 3 - iter 1008/1445 - loss 0.07342512 - time (sec): 52.81 - samples/sec: 2349.05 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-14 09:58:06,869 epoch 3 - iter 1152/1445 - loss 0.07356550 - time (sec): 59.72 - samples/sec: 2355.16 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-14 09:58:13,892 epoch 3 - iter 1296/1445 - loss 0.07388942 - time (sec): 66.74 - samples/sec: 2374.81 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-14 09:58:20,897 epoch 3 - iter 1440/1445 - loss 0.07387594 - time (sec): 73.75 - samples/sec: 2382.74 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-14 09:58:21,112 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:58:21,112 EPOCH 3 done: loss 0.0739 - lr: 0.000039
+ 2023-10-14 09:58:24,726 DEV : loss 0.10955189168453217 - f1-score (micro avg) 0.7736
+ 2023-10-14 09:58:24,753 saving best model
+ 2023-10-14 09:58:25,237 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:58:33,415 epoch 4 - iter 144/1445 - loss 0.05575292 - time (sec): 8.17 - samples/sec: 2111.86 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-14 09:58:40,972 epoch 4 - iter 288/1445 - loss 0.05433728 - time (sec): 15.73 - samples/sec: 2209.10 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-14 09:58:48,740 epoch 4 - iter 432/1445 - loss 0.05126783 - time (sec): 23.50 - samples/sec: 2174.54 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-14 09:58:55,970 epoch 4 - iter 576/1445 - loss 0.05255992 - time (sec): 30.73 - samples/sec: 2256.80 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-14 09:59:03,172 epoch 4 - iter 720/1445 - loss 0.05383510 - time (sec): 37.93 - samples/sec: 2293.78 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-14 09:59:10,572 epoch 4 - iter 864/1445 - loss 0.05800646 - time (sec): 45.33 - samples/sec: 2327.58 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-14 09:59:17,841 epoch 4 - iter 1008/1445 - loss 0.05787781 - time (sec): 52.60 - samples/sec: 2357.32 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-14 09:59:24,960 epoch 4 - iter 1152/1445 - loss 0.05816878 - time (sec): 59.72 - samples/sec: 2351.87 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-14 09:59:31,975 epoch 4 - iter 1296/1445 - loss 0.05645199 - time (sec): 66.73 - samples/sec: 2360.65 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-14 09:59:39,299 epoch 4 - iter 1440/1445 - loss 0.05795378 - time (sec): 74.06 - samples/sec: 2374.45 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-14 09:59:39,517 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:59:39,517 EPOCH 4 done: loss 0.0579 - lr: 0.000033
+ 2023-10-14 09:59:43,265 DEV : loss 0.13582421839237213 - f1-score (micro avg) 0.7781
+ 2023-10-14 09:59:43,289 saving best model
+ 2023-10-14 09:59:43,862 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 09:59:52,276 epoch 5 - iter 144/1445 - loss 0.03810618 - time (sec): 8.41 - samples/sec: 2223.59 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-14 10:00:00,013 epoch 5 - iter 288/1445 - loss 0.03940126 - time (sec): 16.15 - samples/sec: 2221.95 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-14 10:00:07,583 epoch 5 - iter 432/1445 - loss 0.04118564 - time (sec): 23.72 - samples/sec: 2276.97 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-14 10:00:14,965 epoch 5 - iter 576/1445 - loss 0.04133400 - time (sec): 31.10 - samples/sec: 2304.36 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-14 10:00:22,239 epoch 5 - iter 720/1445 - loss 0.04096234 - time (sec): 38.37 - samples/sec: 2322.89 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-14 10:00:29,460 epoch 5 - iter 864/1445 - loss 0.04169367 - time (sec): 45.60 - samples/sec: 2339.62 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 10:00:36,493 epoch 5 - iter 1008/1445 - loss 0.04137275 - time (sec): 52.63 - samples/sec: 2344.75 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 10:00:43,616 epoch 5 - iter 1152/1445 - loss 0.04174783 - time (sec): 59.75 - samples/sec: 2357.54 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 10:00:50,767 epoch 5 - iter 1296/1445 - loss 0.04150397 - time (sec): 66.90 - samples/sec: 2368.59 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 10:00:58,092 epoch 5 - iter 1440/1445 - loss 0.04328645 - time (sec): 74.23 - samples/sec: 2366.61 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 10:00:58,319 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:00:58,319 EPOCH 5 done: loss 0.0432 - lr: 0.000028
+ 2023-10-14 10:01:02,496 DEV : loss 0.13541918992996216 - f1-score (micro avg) 0.8024
+ 2023-10-14 10:01:02,522 saving best model
+ 2023-10-14 10:01:03,190 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:01:11,193 epoch 6 - iter 144/1445 - loss 0.02701007 - time (sec): 8.00 - samples/sec: 2186.87 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 10:01:19,114 epoch 6 - iter 288/1445 - loss 0.02878942 - time (sec): 15.92 - samples/sec: 2281.52 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 10:01:26,397 epoch 6 - iter 432/1445 - loss 0.02954965 - time (sec): 23.21 - samples/sec: 2310.29 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 10:01:33,678 epoch 6 - iter 576/1445 - loss 0.03123260 - time (sec): 30.49 - samples/sec: 2330.56 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 10:01:40,900 epoch 6 - iter 720/1445 - loss 0.03333774 - time (sec): 37.71 - samples/sec: 2343.36 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 10:01:48,262 epoch 6 - iter 864/1445 - loss 0.03315317 - time (sec): 45.07 - samples/sec: 2370.89 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 10:01:55,480 epoch 6 - iter 1008/1445 - loss 0.03407127 - time (sec): 52.29 - samples/sec: 2374.76 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 10:02:02,421 epoch 6 - iter 1152/1445 - loss 0.03360862 - time (sec): 59.23 - samples/sec: 2376.83 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 10:02:09,545 epoch 6 - iter 1296/1445 - loss 0.03318720 - time (sec): 66.35 - samples/sec: 2369.79 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 10:02:16,875 epoch 6 - iter 1440/1445 - loss 0.03384080 - time (sec): 73.68 - samples/sec: 2382.28 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 10:02:17,142 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:02:17,142 EPOCH 6 done: loss 0.0337 - lr: 0.000022
+ 2023-10-14 10:02:20,761 DEV : loss 0.14951762557029724 - f1-score (micro avg) 0.8055
+ 2023-10-14 10:02:20,782 saving best model
+ 2023-10-14 10:02:21,326 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:02:28,530 epoch 7 - iter 144/1445 - loss 0.01859679 - time (sec): 7.20 - samples/sec: 2409.56 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 10:02:36,495 epoch 7 - iter 288/1445 - loss 0.01862360 - time (sec): 15.17 - samples/sec: 2313.65 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 10:02:43,691 epoch 7 - iter 432/1445 - loss 0.02122236 - time (sec): 22.36 - samples/sec: 2339.13 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 10:02:51,023 epoch 7 - iter 576/1445 - loss 0.02018797 - time (sec): 29.69 - samples/sec: 2366.08 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 10:02:58,172 epoch 7 - iter 720/1445 - loss 0.02085339 - time (sec): 36.84 - samples/sec: 2377.25 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 10:03:05,594 epoch 7 - iter 864/1445 - loss 0.02188189 - time (sec): 44.27 - samples/sec: 2386.72 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 10:03:12,809 epoch 7 - iter 1008/1445 - loss 0.02164479 - time (sec): 51.48 - samples/sec: 2388.79 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 10:03:19,872 epoch 7 - iter 1152/1445 - loss 0.02215737 - time (sec): 58.54 - samples/sec: 2389.64 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 10:03:27,552 epoch 7 - iter 1296/1445 - loss 0.02159639 - time (sec): 66.22 - samples/sec: 2387.54 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 10:03:34,922 epoch 7 - iter 1440/1445 - loss 0.02175986 - time (sec): 73.59 - samples/sec: 2388.59 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 10:03:35,171 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:03:35,172 EPOCH 7 done: loss 0.0218 - lr: 0.000017
+ 2023-10-14 10:03:38,829 DEV : loss 0.1777421534061432 - f1-score (micro avg) 0.8178
+ 2023-10-14 10:03:38,851 saving best model
+ 2023-10-14 10:03:39,474 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:03:46,871 epoch 8 - iter 144/1445 - loss 0.01548083 - time (sec): 7.40 - samples/sec: 2445.84 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 10:03:53,941 epoch 8 - iter 288/1445 - loss 0.01427568 - time (sec): 14.47 - samples/sec: 2431.92 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 10:04:01,854 epoch 8 - iter 432/1445 - loss 0.01655715 - time (sec): 22.38 - samples/sec: 2449.47 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 10:04:08,682 epoch 8 - iter 576/1445 - loss 0.01523833 - time (sec): 29.21 - samples/sec: 2379.70 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 10:04:16,110 epoch 8 - iter 720/1445 - loss 0.01549776 - time (sec): 36.63 - samples/sec: 2408.93 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 10:04:23,692 epoch 8 - iter 864/1445 - loss 0.01513458 - time (sec): 44.22 - samples/sec: 2413.23 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 10:04:30,915 epoch 8 - iter 1008/1445 - loss 0.01430193 - time (sec): 51.44 - samples/sec: 2407.19 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 10:04:38,088 epoch 8 - iter 1152/1445 - loss 0.01514747 - time (sec): 58.61 - samples/sec: 2405.46 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 10:04:45,354 epoch 8 - iter 1296/1445 - loss 0.01503901 - time (sec): 65.88 - samples/sec: 2412.79 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 10:04:52,563 epoch 8 - iter 1440/1445 - loss 0.01540928 - time (sec): 73.09 - samples/sec: 2404.82 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 10:04:52,787 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:04:52,787 EPOCH 8 done: loss 0.0154 - lr: 0.000011
+ 2023-10-14 10:04:56,876 DEV : loss 0.17469239234924316 - f1-score (micro avg) 0.8195
+ 2023-10-14 10:04:56,901 saving best model
+ 2023-10-14 10:04:57,410 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:05:04,739 epoch 9 - iter 144/1445 - loss 0.00719715 - time (sec): 7.32 - samples/sec: 2462.89 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 10:05:11,984 epoch 9 - iter 288/1445 - loss 0.00849542 - time (sec): 14.57 - samples/sec: 2445.28 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 10:05:19,098 epoch 9 - iter 432/1445 - loss 0.00729865 - time (sec): 21.68 - samples/sec: 2425.52 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 10:05:26,573 epoch 9 - iter 576/1445 - loss 0.00850589 - time (sec): 29.15 - samples/sec: 2435.29 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 10:05:33,750 epoch 9 - iter 720/1445 - loss 0.00957931 - time (sec): 36.33 - samples/sec: 2419.21 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 10:05:41,317 epoch 9 - iter 864/1445 - loss 0.00998317 - time (sec): 43.90 - samples/sec: 2439.16 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 10:05:48,361 epoch 9 - iter 1008/1445 - loss 0.00949987 - time (sec): 50.94 - samples/sec: 2424.28 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 10:05:55,632 epoch 9 - iter 1152/1445 - loss 0.00972046 - time (sec): 58.21 - samples/sec: 2432.51 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 10:06:02,678 epoch 9 - iter 1296/1445 - loss 0.01012022 - time (sec): 65.26 - samples/sec: 2429.75 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 10:06:09,749 epoch 9 - iter 1440/1445 - loss 0.00980864 - time (sec): 72.33 - samples/sec: 2431.32 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 10:06:09,984 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:06:09,984 EPOCH 9 done: loss 0.0098 - lr: 0.000006
+ 2023-10-14 10:06:13,690 DEV : loss 0.17753612995147705 - f1-score (micro avg) 0.8164
+ 2023-10-14 10:06:13,708 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:06:20,952 epoch 10 - iter 144/1445 - loss 0.00578931 - time (sec): 7.24 - samples/sec: 2299.68 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 10:06:28,812 epoch 10 - iter 288/1445 - loss 0.00515456 - time (sec): 15.10 - samples/sec: 2349.36 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 10:06:36,007 epoch 10 - iter 432/1445 - loss 0.00857074 - time (sec): 22.30 - samples/sec: 2386.36 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 10:06:43,072 epoch 10 - iter 576/1445 - loss 0.00818790 - time (sec): 29.36 - samples/sec: 2377.72 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 10:06:50,686 epoch 10 - iter 720/1445 - loss 0.00811613 - time (sec): 36.98 - samples/sec: 2386.90 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 10:06:58,142 epoch 10 - iter 864/1445 - loss 0.00732934 - time (sec): 44.43 - samples/sec: 2373.05 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 10:07:05,597 epoch 10 - iter 1008/1445 - loss 0.00736235 - time (sec): 51.89 - samples/sec: 2389.94 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 10:07:12,759 epoch 10 - iter 1152/1445 - loss 0.00677836 - time (sec): 59.05 - samples/sec: 2393.33 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 10:07:19,810 epoch 10 - iter 1296/1445 - loss 0.00657795 - time (sec): 66.10 - samples/sec: 2392.12 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 10:07:27,162 epoch 10 - iter 1440/1445 - loss 0.00682360 - time (sec): 73.45 - samples/sec: 2388.96 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 10:07:27,463 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:07:27,464 EPOCH 10 done: loss 0.0068 - lr: 0.000000
+ 2023-10-14 10:07:31,044 DEV : loss 0.18386265635490417 - f1-score (micro avg) 0.8243
+ 2023-10-14 10:07:31,069 saving best model
+ 2023-10-14 10:07:32,182 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 10:07:32,184 Loading model from best epoch ...
+ 2023-10-14 10:07:33,955 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-14 10:07:37,400
+ Results:
+ - F-score (micro) 0.818
+ - F-score (macro) 0.7206
+ - Accuracy 0.7037
+
+ By class:
+               precision    recall  f1-score   support
+
+          PER     0.8142    0.8091    0.8117       482
+          LOC     0.9071    0.8319    0.8679       458
+          ORG     0.6279    0.3913    0.4821        69
+
+    micro avg     0.8471    0.7909    0.8180      1009
+    macro avg     0.7831    0.6774    0.7206      1009
+ weighted avg     0.8436    0.7909    0.8146      1009
+
+ 2023-10-14 10:07:37,400 ----------------------------------------------------------------------------------------------------
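As a quick arithmetic check on the final report: micro F1 is the harmonic mean of micro precision and recall, and recomputing it from the two numbers in the "micro avg" row reproduces the logged "F-score (micro) 0.818":

```python
# Micro F1 = harmonic mean of micro-averaged precision and recall.
precision, recall = 0.8471, 0.7909  # "micro avg" row of the report above
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.818
```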