stefan-it committed on
Commit 25f1529
Parent: 9abf9ab

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +240 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a5c65f17b88cfc113ad3f9ec884ab163f22654a785e89fef5c8a5fa18b186db2
+ size 443311111
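
The three added lines are a Git LFS pointer, not the model weights themselves. As a minimal sketch (the helper name `parse_lfs_pointer` is hypothetical, not part of any library), the pointer can be read as space-separated key/value pairs to recover the hash algorithm, digest, and payload size:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a5c65f17b88cfc113ad3f9ec884ab163f22654a785e89fef5c8a5fa18b186db2
size 443311111
"""

info = parse_lfs_pointer(pointer)
size_mib = info["size_bytes"] / 2**20
print(f"{info['hash_algo']} digest, {size_mib:.1f} MiB payload")
```

The reported size refers to the real `best-model.pt` stored in LFS, which is why the diff shows only three added lines.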
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      08:29:52   0.0000         0.4157      0.1674    0.6029         0.4690      0.5276  0.3768
+ 2      08:30:56   0.0000         0.1160      0.0996    0.7977         0.7128      0.7529  0.6205
+ 3      08:31:59   0.0000         0.0694      0.0921    0.8118         0.7397      0.7741  0.6503
+ 4      08:33:04   0.0000         0.0459      0.0935    0.8191         0.7624      0.7897  0.6655
+ 5      08:34:09   0.0000         0.0327      0.1094    0.8128         0.7986      0.8056  0.6883
+ 6      08:35:14   0.0000         0.0264      0.1391    0.7768         0.7944      0.7855  0.6635
+ 7      08:36:18   0.0000         0.0185      0.1708    0.8293         0.7831      0.8055  0.6860
+ 8      08:37:23   0.0000         0.0152      0.1741    0.8190         0.7758      0.7968  0.6766
+ 9      08:38:27   0.0000         0.0107      0.1968    0.8199         0.7758      0.7972  0.6766
+ 10     08:39:30   0.0000         0.0088      0.2016    0.8318         0.7665      0.7978  0.6764
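
The table shows dev F1 peaking at a different epoch than dev loss bottoms out. A minimal sketch for pulling both out of `loss.tsv` follows; the rows are embedded as a whitespace-separated string for self-containment (the real file is tab-separated, which `str.split()` also handles), and the helper name `load_rows` is hypothetical:

```python
# Rows copied from loss.tsv above (LEARNING_RATE is printed with 4 decimals,
# so values around 3e-05 show up as 0.0000).
LOSS_TSV = """EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 08:29:52 0.0000 0.4157 0.1674 0.6029 0.4690 0.5276 0.3768
2 08:30:56 0.0000 0.1160 0.0996 0.7977 0.7128 0.7529 0.6205
3 08:31:59 0.0000 0.0694 0.0921 0.8118 0.7397 0.7741 0.6503
4 08:33:04 0.0000 0.0459 0.0935 0.8191 0.7624 0.7897 0.6655
5 08:34:09 0.0000 0.0327 0.1094 0.8128 0.7986 0.8056 0.6883
6 08:35:14 0.0000 0.0264 0.1391 0.7768 0.7944 0.7855 0.6635
7 08:36:18 0.0000 0.0185 0.1708 0.8293 0.7831 0.8055 0.6860
8 08:37:23 0.0000 0.0152 0.1741 0.8190 0.7758 0.7968 0.6766
9 08:38:27 0.0000 0.0107 0.1968 0.8199 0.7758 0.7972 0.6766
10 08:39:30 0.0000 0.0088 0.2016 0.8318 0.7665 0.7978 0.6764
"""


def load_rows(text: str) -> list[dict]:
    """Turn whitespace- or tab-separated text into a list of row dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = load_rows(LOSS_TSV)
best_f1_row = max(rows, key=lambda r: float(r["DEV_F1"]))
best_loss_row = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best dev F1  :", best_f1_row["EPOCH"], best_f1_row["DEV_F1"])
print("best dev loss:", best_loss_row["EPOCH"], best_loss_row["DEV_LOSS"])
```

Training loss keeps falling through epoch 10 while dev loss rises after epoch 3, so selecting the checkpoint by dev F1 rather than by the final epoch matters for this run.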
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,240 @@
+ 2023-10-14 08:28:49,609 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,610 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
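
The printed module tree is enough to estimate the parameter count by hand. The sketch below is plain arithmetic over the shapes shown above, not a call into PyTorch or Flair; the helper names are hypothetical, and it assumes every `Linear` has a bias and every `LayerNorm` has a weight and a bias, as the repr indicates:

```python
def linear(in_features: int, out_features: int) -> int:
    """Weights plus bias of a Linear layer."""
    return in_features * out_features + out_features


def layer_norm(dim: int) -> int:
    """Scale and shift vectors of an affine LayerNorm."""
    return 2 * dim


# Shapes read off the model printout.
hidden, ffn = 768, 3072
vocab, max_pos, token_types = 32001, 512, 2
num_layers, num_tags = 12, 13

embeddings = (vocab + max_pos + token_types) * hidden + layer_norm(hidden)
encoder_layer = (
    4 * linear(hidden, hidden)   # query, key, value, attention output
    + layer_norm(hidden)
    + linear(hidden, ffn)        # intermediate
    + linear(ffn, hidden)        # output
    + layer_norm(hidden)
)
pooler = linear(hidden, hidden)
tagger_head = linear(hidden, num_tags)

total_params = embeddings + num_layers * encoder_layer + pooler + tagger_head
fp32_bytes = total_params * 4  # assumption: weights stored as float32

print(f"{total_params:,} parameters, about {fp32_bytes / 1e6:.0f} MB in float32")
```

Under the float32 assumption the estimate lands within about 1 MB of the 443311111-byte `best-model.pt` in this commit, so the checkpoint size is consistent with the architecture shown.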
+ 2023-10-14 08:28:49,610 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+ 2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,610 Train: 5777 sentences
+ 2023-10-14 08:28:49,610 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,610 Training Params:
+ 2023-10-14 08:28:49,610 - learning_rate: "3e-05"
+ 2023-10-14 08:28:49,610 - mini_batch_size: "8"
+ 2023-10-14 08:28:49,610 - max_epochs: "10"
+ 2023-10-14 08:28:49,610 - shuffle: "True"
+ 2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,611 Plugins:
+ 2023-10-14 08:28:49,611 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
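
The `lr` column in the iteration lines of this log follows from the parameters above. The sketch below is a generic linear warmup and linear decay schedule, not Flair's own `LinearScheduler` code; the function name is hypothetical, and the step count assumes 723 mini-batches per epoch (5777 sentences at batch size 8) over 10 epochs:

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


steps_per_epoch = 723            # iterations per epoch, as printed in the log
total_steps = steps_per_epoch * 10
peak_lr, warmup_fraction = 3e-05, 0.1


def lr_at(epoch: int, iteration: int) -> str:
    """Learning rate at a given epoch/iteration, formatted like the log."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):.6f}"


for epoch, iteration in [(1, 72), (1, 720), (2, 720), (5, 720), (10, 720)]:
    print(f"epoch {epoch:>2} iter {iteration}: lr {lr_at(epoch, iteration)}")
```

With a warmup fraction of 0.1 the warmup spans exactly the first epoch, which is why the logged rate climbs to 0.000030 by the end of epoch 1 and declines afterwards.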
+ 2023-10-14 08:28:49,611 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 08:28:49,611 - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,611 Computation:
+ 2023-10-14 08:28:49,611 - compute on device: cuda:0
+ 2023-10-14 08:28:49,611 - embedding storage: none
+ 2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,611 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:28:55,399 epoch 1 - iter 72/723 - loss 2.34500859 - time (sec): 5.79 - samples/sec: 2928.05 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 08:29:01,032 epoch 1 - iter 144/723 - loss 1.40146179 - time (sec): 11.42 - samples/sec: 2958.85 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 08:29:07,150 epoch 1 - iter 216/723 - loss 0.98472208 - time (sec): 17.54 - samples/sec: 2978.52 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 08:29:12,983 epoch 1 - iter 288/723 - loss 0.78948184 - time (sec): 23.37 - samples/sec: 2988.51 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 08:29:18,706 epoch 1 - iter 360/723 - loss 0.67269994 - time (sec): 29.09 - samples/sec: 2981.53 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 08:29:24,406 epoch 1 - iter 432/723 - loss 0.60136728 - time (sec): 34.79 - samples/sec: 2938.07 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 08:29:30,603 epoch 1 - iter 504/723 - loss 0.53425447 - time (sec): 40.99 - samples/sec: 2947.40 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 08:29:37,056 epoch 1 - iter 576/723 - loss 0.48656421 - time (sec): 47.44 - samples/sec: 2926.64 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 08:29:43,158 epoch 1 - iter 648/723 - loss 0.44962888 - time (sec): 53.55 - samples/sec: 2930.20 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 08:29:49,227 epoch 1 - iter 720/723 - loss 0.41686927 - time (sec): 59.62 - samples/sec: 2944.90 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 08:29:49,488 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:29:49,488 EPOCH 1 done: loss 0.4157 - lr: 0.000030
+ 2023-10-14 08:29:52,663 DEV : loss 0.16736090183258057 - f1-score (micro avg) 0.5276
+ 2023-10-14 08:29:52,698 saving best model
+ 2023-10-14 08:29:53,052 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:29:58,787 epoch 2 - iter 72/723 - loss 0.15097610 - time (sec): 5.73 - samples/sec: 3028.16 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 08:30:04,682 epoch 2 - iter 144/723 - loss 0.13684182 - time (sec): 11.63 - samples/sec: 2960.60 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 08:30:10,736 epoch 2 - iter 216/723 - loss 0.13585466 - time (sec): 17.68 - samples/sec: 2945.67 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 08:30:16,269 epoch 2 - iter 288/723 - loss 0.12788984 - time (sec): 23.22 - samples/sec: 2985.08 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 08:30:22,637 epoch 2 - iter 360/723 - loss 0.12683567 - time (sec): 29.58 - samples/sec: 2963.23 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 08:30:28,367 epoch 2 - iter 432/723 - loss 0.12355013 - time (sec): 35.31 - samples/sec: 2965.98 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 08:30:34,488 epoch 2 - iter 504/723 - loss 0.12414425 - time (sec): 41.44 - samples/sec: 2962.13 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 08:30:39,830 epoch 2 - iter 576/723 - loss 0.12115131 - time (sec): 46.78 - samples/sec: 2963.61 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 08:30:46,149 epoch 2 - iter 648/723 - loss 0.11808836 - time (sec): 53.10 - samples/sec: 2972.27 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 08:30:52,188 epoch 2 - iter 720/723 - loss 0.11598087 - time (sec): 59.14 - samples/sec: 2970.40 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 08:30:52,397 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:30:52,397 EPOCH 2 done: loss 0.1160 - lr: 0.000027
+ 2023-10-14 08:30:56,289 DEV : loss 0.09956270456314087 - f1-score (micro avg) 0.7529
+ 2023-10-14 08:30:56,304 saving best model
+ 2023-10-14 08:30:56,754 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:31:02,841 epoch 3 - iter 72/723 - loss 0.07480008 - time (sec): 6.09 - samples/sec: 2911.61 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 08:31:08,895 epoch 3 - iter 144/723 - loss 0.06907493 - time (sec): 12.14 - samples/sec: 2929.73 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 08:31:14,786 epoch 3 - iter 216/723 - loss 0.07126848 - time (sec): 18.03 - samples/sec: 2950.59 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 08:31:20,680 epoch 3 - iter 288/723 - loss 0.06958853 - time (sec): 23.92 - samples/sec: 2960.31 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 08:31:26,523 epoch 3 - iter 360/723 - loss 0.06902856 - time (sec): 29.77 - samples/sec: 2974.29 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 08:31:32,101 epoch 3 - iter 432/723 - loss 0.06866383 - time (sec): 35.35 - samples/sec: 3001.94 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 08:31:38,269 epoch 3 - iter 504/723 - loss 0.07120681 - time (sec): 41.51 - samples/sec: 2963.35 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 08:31:44,320 epoch 3 - iter 576/723 - loss 0.06999573 - time (sec): 47.56 - samples/sec: 2969.47 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 08:31:50,039 epoch 3 - iter 648/723 - loss 0.06994277 - time (sec): 53.28 - samples/sec: 2982.09 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 08:31:56,214 epoch 3 - iter 720/723 - loss 0.06941747 - time (sec): 59.46 - samples/sec: 2956.51 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 08:31:56,392 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:31:56,392 EPOCH 3 done: loss 0.0694 - lr: 0.000023
+ 2023-10-14 08:31:59,885 DEV : loss 0.09209852665662766 - f1-score (micro avg) 0.7741
+ 2023-10-14 08:31:59,909 saving best model
+ 2023-10-14 08:32:00,444 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:32:07,148 epoch 4 - iter 72/723 - loss 0.04111019 - time (sec): 6.70 - samples/sec: 2691.45 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 08:32:13,780 epoch 4 - iter 144/723 - loss 0.03743022 - time (sec): 13.33 - samples/sec: 2745.82 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 08:32:19,590 epoch 4 - iter 216/723 - loss 0.04032334 - time (sec): 19.14 - samples/sec: 2778.83 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 08:32:25,877 epoch 4 - iter 288/723 - loss 0.04209697 - time (sec): 25.43 - samples/sec: 2803.23 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 08:32:31,717 epoch 4 - iter 360/723 - loss 0.04334243 - time (sec): 31.27 - samples/sec: 2834.19 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 08:32:37,415 epoch 4 - iter 432/723 - loss 0.04378885 - time (sec): 36.97 - samples/sec: 2846.93 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 08:32:43,029 epoch 4 - iter 504/723 - loss 0.04269892 - time (sec): 42.58 - samples/sec: 2864.00 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 08:32:49,370 epoch 4 - iter 576/723 - loss 0.04332948 - time (sec): 48.92 - samples/sec: 2871.55 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 08:32:55,288 epoch 4 - iter 648/723 - loss 0.04450354 - time (sec): 54.84 - samples/sec: 2867.88 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 08:33:01,292 epoch 4 - iter 720/723 - loss 0.04532438 - time (sec): 60.84 - samples/sec: 2888.17 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 08:33:01,510 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:33:01,510 EPOCH 4 done: loss 0.0459 - lr: 0.000020
+ 2023-10-14 08:33:04,963 DEV : loss 0.09347887337207794 - f1-score (micro avg) 0.7897
+ 2023-10-14 08:33:04,979 saving best model
+ 2023-10-14 08:33:05,498 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:33:11,770 epoch 5 - iter 72/723 - loss 0.02961405 - time (sec): 6.27 - samples/sec: 2830.72 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 08:33:17,831 epoch 5 - iter 144/723 - loss 0.03413594 - time (sec): 12.33 - samples/sec: 2887.48 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 08:33:23,316 epoch 5 - iter 216/723 - loss 0.03172083 - time (sec): 17.82 - samples/sec: 2945.68 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 08:33:29,114 epoch 5 - iter 288/723 - loss 0.03207149 - time (sec): 23.62 - samples/sec: 2961.38 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 08:33:34,760 epoch 5 - iter 360/723 - loss 0.03130345 - time (sec): 29.26 - samples/sec: 2995.97 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 08:33:41,220 epoch 5 - iter 432/723 - loss 0.03111424 - time (sec): 35.72 - samples/sec: 2964.12 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 08:33:46,931 epoch 5 - iter 504/723 - loss 0.03173225 - time (sec): 41.43 - samples/sec: 2962.14 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 08:33:52,899 epoch 5 - iter 576/723 - loss 0.03178811 - time (sec): 47.40 - samples/sec: 2966.06 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 08:33:59,266 epoch 5 - iter 648/723 - loss 0.03309214 - time (sec): 53.77 - samples/sec: 2947.98 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 08:34:04,879 epoch 5 - iter 720/723 - loss 0.03278333 - time (sec): 59.38 - samples/sec: 2955.61 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 08:34:05,223 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:34:05,223 EPOCH 5 done: loss 0.0327 - lr: 0.000017
+ 2023-10-14 08:34:09,605 DEV : loss 0.1093878448009491 - f1-score (micro avg) 0.8056
+ 2023-10-14 08:34:09,627 saving best model
+ 2023-10-14 08:34:10,174 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:34:16,234 epoch 6 - iter 72/723 - loss 0.02664406 - time (sec): 6.06 - samples/sec: 2967.43 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 08:34:22,424 epoch 6 - iter 144/723 - loss 0.02610538 - time (sec): 12.25 - samples/sec: 2951.86 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 08:34:28,422 epoch 6 - iter 216/723 - loss 0.02632130 - time (sec): 18.25 - samples/sec: 2958.67 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 08:34:34,905 epoch 6 - iter 288/723 - loss 0.02918746 - time (sec): 24.73 - samples/sec: 2906.90 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 08:34:40,499 epoch 6 - iter 360/723 - loss 0.02786739 - time (sec): 30.32 - samples/sec: 2923.14 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 08:34:46,405 epoch 6 - iter 432/723 - loss 0.02591985 - time (sec): 36.23 - samples/sec: 2914.88 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 08:34:52,512 epoch 6 - iter 504/723 - loss 0.02585531 - time (sec): 42.34 - samples/sec: 2913.63 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 08:34:58,755 epoch 6 - iter 576/723 - loss 0.02673939 - time (sec): 48.58 - samples/sec: 2916.01 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 08:35:04,619 epoch 6 - iter 648/723 - loss 0.02626021 - time (sec): 54.44 - samples/sec: 2908.49 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 08:35:10,343 epoch 6 - iter 720/723 - loss 0.02645765 - time (sec): 60.17 - samples/sec: 2919.78 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 08:35:10,568 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:35:10,568 EPOCH 6 done: loss 0.0264 - lr: 0.000013
+ 2023-10-14 08:35:14,104 DEV : loss 0.1391119509935379 - f1-score (micro avg) 0.7855
+ 2023-10-14 08:35:14,124 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:35:20,351 epoch 7 - iter 72/723 - loss 0.01350535 - time (sec): 6.23 - samples/sec: 2796.08 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 08:35:26,465 epoch 7 - iter 144/723 - loss 0.01712040 - time (sec): 12.34 - samples/sec: 2766.13 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 08:35:33,034 epoch 7 - iter 216/723 - loss 0.01751483 - time (sec): 18.91 - samples/sec: 2774.49 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 08:35:39,319 epoch 7 - iter 288/723 - loss 0.01623115 - time (sec): 25.19 - samples/sec: 2798.65 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 08:35:45,189 epoch 7 - iter 360/723 - loss 0.01745382 - time (sec): 31.06 - samples/sec: 2833.38 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 08:35:51,313 epoch 7 - iter 432/723 - loss 0.01897664 - time (sec): 37.19 - samples/sec: 2863.45 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 08:35:56,993 epoch 7 - iter 504/723 - loss 0.01893676 - time (sec): 42.87 - samples/sec: 2874.41 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 08:36:02,983 epoch 7 - iter 576/723 - loss 0.01873928 - time (sec): 48.86 - samples/sec: 2894.89 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 08:36:08,710 epoch 7 - iter 648/723 - loss 0.01874669 - time (sec): 54.59 - samples/sec: 2894.98 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 08:36:14,582 epoch 7 - iter 720/723 - loss 0.01849634 - time (sec): 60.46 - samples/sec: 2906.14 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 08:36:14,763 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:36:14,763 EPOCH 7 done: loss 0.0185 - lr: 0.000010
+ 2023-10-14 08:36:18,275 DEV : loss 0.17075838148593903 - f1-score (micro avg) 0.8055
+ 2023-10-14 08:36:18,295 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:36:24,144 epoch 8 - iter 72/723 - loss 0.01760211 - time (sec): 5.85 - samples/sec: 2982.67 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 08:36:30,981 epoch 8 - iter 144/723 - loss 0.01623072 - time (sec): 12.68 - samples/sec: 2781.67 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 08:36:36,973 epoch 8 - iter 216/723 - loss 0.01533673 - time (sec): 18.68 - samples/sec: 2833.48 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 08:36:42,799 epoch 8 - iter 288/723 - loss 0.01601558 - time (sec): 24.50 - samples/sec: 2864.13 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 08:36:49,045 epoch 8 - iter 360/723 - loss 0.01466925 - time (sec): 30.75 - samples/sec: 2897.17 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 08:36:54,836 epoch 8 - iter 432/723 - loss 0.01402939 - time (sec): 36.54 - samples/sec: 2898.47 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 08:37:00,513 epoch 8 - iter 504/723 - loss 0.01405464 - time (sec): 42.22 - samples/sec: 2926.51 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 08:37:06,081 epoch 8 - iter 576/723 - loss 0.01491234 - time (sec): 47.78 - samples/sec: 2932.12 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 08:37:12,626 epoch 8 - iter 648/723 - loss 0.01502329 - time (sec): 54.33 - samples/sec: 2916.30 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 08:37:18,676 epoch 8 - iter 720/723 - loss 0.01521053 - time (sec): 60.38 - samples/sec: 2912.50 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 08:37:18,861 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:37:18,861 EPOCH 8 done: loss 0.0152 - lr: 0.000007
+ 2023-10-14 08:37:23,265 DEV : loss 0.17406047880649567 - f1-score (micro avg) 0.7968
+ 2023-10-14 08:37:23,286 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:37:29,454 epoch 9 - iter 72/723 - loss 0.01059210 - time (sec): 6.17 - samples/sec: 2956.17 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 08:37:36,031 epoch 9 - iter 144/723 - loss 0.01193666 - time (sec): 12.74 - samples/sec: 2885.31 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 08:37:42,240 epoch 9 - iter 216/723 - loss 0.01045503 - time (sec): 18.95 - samples/sec: 2951.17 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 08:37:47,987 epoch 9 - iter 288/723 - loss 0.00976298 - time (sec): 24.70 - samples/sec: 2920.37 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 08:37:54,314 epoch 9 - iter 360/723 - loss 0.01051185 - time (sec): 31.03 - samples/sec: 2926.68 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 08:37:59,729 epoch 9 - iter 432/723 - loss 0.01007391 - time (sec): 36.44 - samples/sec: 2939.41 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 08:38:05,709 epoch 9 - iter 504/723 - loss 0.01053006 - time (sec): 42.42 - samples/sec: 2926.20 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 08:38:11,121 epoch 9 - iter 576/723 - loss 0.01019871 - time (sec): 47.83 - samples/sec: 2929.84 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 08:38:17,094 epoch 9 - iter 648/723 - loss 0.01035051 - time (sec): 53.81 - samples/sec: 2927.27 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 08:38:23,337 epoch 9 - iter 720/723 - loss 0.01068224 - time (sec): 60.05 - samples/sec: 2925.47 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 08:38:23,534 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:38:23,534 EPOCH 9 done: loss 0.0107 - lr: 0.000003
+ 2023-10-14 08:38:27,028 DEV : loss 0.1967579573392868 - f1-score (micro avg) 0.7972
+ 2023-10-14 08:38:27,047 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:38:33,194 epoch 10 - iter 72/723 - loss 0.00260707 - time (sec): 6.15 - samples/sec: 2990.09 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 08:38:38,655 epoch 10 - iter 144/723 - loss 0.00602904 - time (sec): 11.61 - samples/sec: 2988.23 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 08:38:44,750 epoch 10 - iter 216/723 - loss 0.01005193 - time (sec): 17.70 - samples/sec: 2974.13 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 08:38:51,428 epoch 10 - iter 288/723 - loss 0.00877443 - time (sec): 24.38 - samples/sec: 2908.63 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 08:38:56,998 epoch 10 - iter 360/723 - loss 0.00837203 - time (sec): 29.95 - samples/sec: 2933.31 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 08:39:03,575 epoch 10 - iter 432/723 - loss 0.00824030 - time (sec): 36.53 - samples/sec: 2931.89 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 08:39:09,222 epoch 10 - iter 504/723 - loss 0.00865644 - time (sec): 42.17 - samples/sec: 2942.99 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 08:39:14,978 epoch 10 - iter 576/723 - loss 0.00897306 - time (sec): 47.93 - samples/sec: 2944.84 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 08:39:20,723 epoch 10 - iter 648/723 - loss 0.00883926 - time (sec): 53.67 - samples/sec: 2940.14 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 08:39:26,903 epoch 10 - iter 720/723 - loss 0.00871898 - time (sec): 59.85 - samples/sec: 2938.23 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 08:39:27,070 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:39:27,070 EPOCH 10 done: loss 0.0088 - lr: 0.000000
+ 2023-10-14 08:39:30,587 DEV : loss 0.2016027718782425 - f1-score (micro avg) 0.7978
+ 2023-10-14 08:39:30,972 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 08:39:30,973 Loading model from best epoch ...
+ 2023-10-14 08:39:32,712 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-14 08:39:35,860
+ Results:
+ - F-score (micro) 0.7954
+ - F-score (macro) 0.6863
+ - Accuracy 0.6765
+
+ By class:
+               precision    recall  f1-score   support
+
+          PER     0.7629    0.8610    0.8090       482
+          LOC     0.8997    0.7838    0.8378       458
+          ORG     0.4355    0.3913    0.4122        69
+
+    micro avg     0.7970    0.7939    0.7954      1009
+    macro avg     0.6994    0.6787    0.6863      1009
+ weighted avg     0.8026    0.7939    0.7949      1009
+
+ 2023-10-14 08:39:35,861 ----------------------------------------------------------------------------------------------------
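
The averaged rows of the test report can be re-derived from the per-class rows. The sketch below uses the standard definitions (macro is the unweighted mean over classes, weighted uses support as the weight, micro F1 is the harmonic mean of micro precision and recall). It works from the rounded figures in the log, so agreement is only expected to about four decimals:

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "PER": (0.7629, 0.8610, 0.8090, 482),
    "LOC": (0.8997, 0.7838, 0.8378, 458),
    "ORG": (0.4355, 0.3913, 0.4122, 69),
}

total_support = sum(support for *_, support in per_class.values())

# Macro average: every class counts equally, regardless of frequency.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class weighted by its number of gold entities.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the pooled precision and recall from the log.
micro_p, micro_r = 0.7970, 0.7939
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support     {total_support}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The gap between micro F1 and macro F1 comes almost entirely from ORG, which has only 69 test entities and an F1 of 0.4122; it pulls the macro average down but carries little weight in the micro average.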