Dmitry Chaplinsky committed on
Commit
eb80582
1 Parent(s): 2abbc46

First release

Files changed (6)
  1. README.md +33 -0
  2. best-lm.pt +3 -0
  3. flair_dictionary.pkl +3 -0
  4. loss.txt +336 -0
  5. pipeline.py +22 -0
  6. requirements.txt +1 -0
README.md CHANGED
@@ -1,3 +1,36 @@
  ---
+ language:
+ - uk
+ tags:
+ - text2text-generation
+ - flair
+ library_name: generic
  license: mit
+ metrics:
+ - perplexity
+ datasets:
+ - ubertext2.0
+ widget:
+ - text: "підсумував він."
+ - text: "Україна переможе!"
  ---
+
+ # Ukrainian flair embeddings (backward)
+
+ Trained for 12+ epochs on the texts from ubertext2.0 (WIP).
+ The character dictionary used for training is in the `flair_dictionary.pkl` file.
+
+ For more information on flair embeddings, see [the article](https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/FLAIR_EMBEDDINGS.md) or the paper below:
+
+
+ ```bibtex
+ @inproceedings{akbik2018coling,
+ title={Contextual String Embeddings for Sequence Labeling},
+ author={Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
+ booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
+ pages = {1638--1649},
+ year = {2018}
+ }
+ ```
+
+ Copyright: Dmytro Chaplynskyi, [lang-uk](https://lang.org.ua) project, 2022
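The `perplexity` metric declared in the metadata is the exponential of the validation loss, which is how the `valid loss` and `valid ppl` columns of `loss.txt` in this commit relate. The following is a minimal sketch, not part of the release, that parses lines in that format and checks the relation. The helper names `parse_loss_line`, `ppl_matches_loss`, and `learning_rate_changes` are hypothetical, and the regular expression assumes the exact field layout shown in `loss.txt`.

```python
import math
import re

# Assumed layout of one loss.txt line, as it appears in this commit.
LINE_RE = re.compile(
    r"end of split\s+(?P<split>\d+)\s*/\s*(?P<total>\d+)\s*\|"
    r"\s*epoch\s+(?P<epoch>\d+)\s*\|"
    r"\s*time:\s*(?P<time>[\d.]+)s\s*\|"
    r"\s*valid loss\s+(?P<loss>[\d.]+)\s*\|"
    r"\s*valid ppl\s+(?P<ppl>[\d.]+)\s*\|"
    r"\s*learning rate\s+(?P<lr>[\d.]+)"
)


def parse_loss_line(line):
    """Return the fields of one loss.txt line as a dict, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "split": int(m["split"]),
        "total_splits": int(m["total"]),
        "epoch": int(m["epoch"]),
        "seconds": float(m["time"]),
        "loss": float(m["loss"]),
        "ppl": float(m["ppl"]),
        "lr": float(m["lr"]),
    }


def ppl_matches_loss(record, rel_tol=1e-3):
    """Check that the logged perplexity equals exp(loss) up to rounding."""
    return math.isclose(math.exp(record["loss"]), record["ppl"], rel_tol=rel_tol)


def learning_rate_changes(records):
    """List (epoch, split, old_lr, new_lr) for each point where the rate changes."""
    changes = []
    for prev, cur in zip(records, records[1:]):
        if cur["lr"] != prev["lr"]:
            changes.append((cur["epoch"], cur["split"], prev["lr"], cur["lr"]))
    return changes
```

Applied to the records parsed from `loss.txt`, `learning_rate_changes` would flag the drop from 20.0000 to 5.0000 that the log records at split 5 of epoch 8. The relative tolerance is there because both columns are rounded to four decimal places.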
best-lm.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5a810aa30566d93280cdb89385000d44cd6320d559710784072ade264200620a
+ size 22791455
flair_dictionary.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2125c32d2db5fb79676a8a6f087b19e9c3b788cb19b87073423e31e176d1fe24
+ size 11900
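Both `best-lm.pt` and `flair_dictionary.pkl` are committed as Git LFS pointer files: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. As a minimal sketch, assuming the real file has already been downloaded to a local path, a download could be checked against its pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repository.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>", e.g. "sha256:5a81...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

The size comparison runs first because it is cheap and rejects truncated downloads without hashing the whole file.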
loss.txt ADDED
@@ -0,0 +1,336 @@
+ | end of split 1 / 28 | epoch 1 | time: 3789.14s | valid loss 1.9590 | valid ppl 7.0919 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 1 | time: 3789.55s | valid loss 1.5745 | valid ppl 4.8282 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 1 | time: 3801.06s | valid loss 1.4277 | valid ppl 4.1690 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 1 | time: 3796.22s | valid loss 1.3590 | valid ppl 3.8922 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 1 | time: 3796.46s | valid loss 1.3225 | valid ppl 3.7527 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 1 | time: 3800.42s | valid loss 1.2908 | valid ppl 3.6357 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 1 | time: 3795.50s | valid loss 1.2755 | valid ppl 3.5803 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 1 | time: 3796.83s | valid loss 1.2515 | valid ppl 3.4956 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 1 | time: 3795.35s | valid loss 1.2422 | valid ppl 3.4631 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 1 | time: 3797.17s | valid loss 1.2255 | valid ppl 3.4059 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 1 | time: 3792.19s | valid loss 1.2145 | valid ppl 3.3686 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 1 | time: 3789.43s | valid loss 1.2078 | valid ppl 3.3463 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 1 | time: 36736.65s | valid loss 1.1987 | valid ppl 3.3159 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 1 | time: 3787.94s | valid loss 1.1954 | valid ppl 3.3047 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 1 | time: 3809.75s | valid loss 1.1862 | valid ppl 3.2745 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 1 | time: 3844.97s | valid loss 1.1829 | valid ppl 3.2637 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 1 | time: 3843.82s | valid loss 1.1774 | valid ppl 3.2460 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 1 | time: 3846.40s | valid loss 1.1728 | valid ppl 3.2310 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 1 | time: 3844.98s | valid loss 1.1681 | valid ppl 3.2159 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 1 | time: 3815.00s | valid loss 1.1632 | valid ppl 3.2000 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 1 | time: 3794.38s | valid loss 1.1613 | valid ppl 3.1939 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 1 | time: 3796.78s | valid loss 1.1564 | valid ppl 3.1786 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 1 | time: 3797.39s | valid loss 1.1545 | valid ppl 3.1725 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 1 | time: 3797.94s | valid loss 1.1518 | valid ppl 3.1640 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 1 | time: 3796.01s | valid loss 1.1469 | valid ppl 3.1485 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 1 | time: 3796.73s | valid loss 1.1459 | valid ppl 3.1451 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 1 | time: 3796.46s | valid loss 1.1429 | valid ppl 3.1358 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 1 | time: 1096.56s | valid loss 1.1447 | valid ppl 3.1414 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 2 | time: 3793.96s | valid loss 1.1414 | valid ppl 3.1312 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 2 | time: 1096.67s | valid loss 1.1419 | valid ppl 3.1329 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 2 | time: 3796.47s | valid loss 1.1401 | valid ppl 3.1269 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 2 | time: 3798.81s | valid loss 1.1371 | valid ppl 3.1176 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 2 | time: 3797.67s | valid loss 1.1361 | valid ppl 3.1146 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 2 | time: 3798.63s | valid loss 1.1336 | valid ppl 3.1067 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 2 | time: 3791.11s | valid loss 1.1323 | valid ppl 3.1028 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 2 | time: 3788.66s | valid loss 1.1296 | valid ppl 3.0944 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 2 | time: 3797.21s | valid loss 1.1272 | valid ppl 3.0869 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 2 | time: 3794.19s | valid loss 1.1253 | valid ppl 3.0810 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 2 | time: 3797.66s | valid loss 1.1238 | valid ppl 3.0765 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 2 | time: 3795.30s | valid loss 1.1242 | valid ppl 3.0777 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 2 | time: 3799.97s | valid loss 1.1220 | valid ppl 3.0710 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 2 | time: 3798.40s | valid loss 1.1198 | valid ppl 3.0644 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 2 | time: 3800.94s | valid loss 1.1200 | valid ppl 3.0650 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 2 | time: 3795.23s | valid loss 1.1184 | valid ppl 3.0600 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 2 | time: 3797.60s | valid loss 1.1181 | valid ppl 3.0591 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 2 | time: 3794.23s | valid loss 1.1155 | valid ppl 3.0512 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 2 | time: 3794.97s | valid loss 1.1144 | valid ppl 3.0477 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 2 | time: 3801.57s | valid loss 1.1144 | valid ppl 3.0476 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 2 | time: 3797.96s | valid loss 1.1128 | valid ppl 3.0428 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 2 | time: 3797.43s | valid loss 1.1112 | valid ppl 3.0381 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 2 | time: 3794.87s | valid loss 1.1099 | valid ppl 3.0342 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 2 | time: 3799.90s | valid loss 1.1100 | valid ppl 3.0344 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 2 | time: 3802.10s | valid loss 1.1083 | valid ppl 3.0291 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 2 | time: 3800.69s | valid loss 1.1076 | valid ppl 3.0270 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 2 | time: 3796.47s | valid loss 1.1065 | valid ppl 3.0238 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 2 | time: 3801.18s | valid loss 1.1051 | valid ppl 3.0196 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 3 | time: 3796.57s | valid loss 1.1045 | valid ppl 3.0176 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 3 | time: 3801.61s | valid loss 1.1035 | valid ppl 3.0146 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 3 | time: 3800.25s | valid loss 1.1027 | valid ppl 3.0122 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 3 | time: 3800.72s | valid loss 1.1013 | valid ppl 3.0080 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 3 | time: 3802.82s | valid loss 1.1010 | valid ppl 3.0072 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 3 | time: 3802.42s | valid loss 1.1003 | valid ppl 3.0052 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 3 | time: 3798.84s | valid loss 1.1001 | valid ppl 3.0044 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 3 | time: 3793.80s | valid loss 1.1002 | valid ppl 3.0046 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 3 | time: 3797.24s | valid loss 1.0987 | valid ppl 3.0002 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 3 | time: 3795.35s | valid loss 1.0976 | valid ppl 2.9969 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 3 | time: 3796.91s | valid loss 1.0978 | valid ppl 2.9976 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 3 | time: 3797.71s | valid loss 1.0973 | valid ppl 2.9962 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 3 | time: 3795.99s | valid loss 1.0967 | valid ppl 2.9943 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 3 | time: 3795.07s | valid loss 1.0957 | valid ppl 2.9913 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 3 | time: 3793.25s | valid loss 1.0942 | valid ppl 2.9869 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 3 | time: 3797.79s | valid loss 1.0940 | valid ppl 2.9863 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 3 | time: 3796.74s | valid loss 1.0934 | valid ppl 2.9844 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 3 | time: 3794.47s | valid loss 1.0924 | valid ppl 2.9815 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 3 | time: 3794.62s | valid loss 1.0924 | valid ppl 2.9814 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 3 | time: 3797.27s | valid loss 1.0907 | valid ppl 2.9764 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 3 | time: 3796.49s | valid loss 1.0909 | valid ppl 2.9770 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 3 | time: 3798.45s | valid loss 1.0913 | valid ppl 2.9783 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 3 | time: 1098.05s | valid loss 1.0917 | valid ppl 2.9792 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 3 | time: 3789.62s | valid loss 1.0908 | valid ppl 2.9768 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 3 | time: 3790.60s | valid loss 1.0899 | valid ppl 2.9739 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 3 | time: 3794.69s | valid loss 1.0878 | valid ppl 2.9677 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 3 | time: 3789.68s | valid loss 1.0886 | valid ppl 2.9702 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 3 | time: 3798.26s | valid loss 1.0890 | valid ppl 2.9712 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 4 | time: 3791.05s | valid loss 1.0875 | valid ppl 2.9668 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 4 | time: 3801.11s | valid loss 1.0872 | valid ppl 2.9658 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 4 | time: 3799.85s | valid loss 1.0874 | valid ppl 2.9665 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 4 | time: 3798.81s | valid loss 1.0856 | valid ppl 2.9611 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 4 | time: 3799.37s | valid loss 1.0849 | valid ppl 2.9591 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 4 | time: 3794.42s | valid loss 1.0845 | valid ppl 2.9578 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 4 | time: 3795.86s | valid loss 1.0865 | valid ppl 2.9639 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 4 | time: 3796.29s | valid loss 1.0845 | valid ppl 2.9580 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 4 | time: 3799.07s | valid loss 1.0838 | valid ppl 2.9560 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 4 | time: 3798.77s | valid loss 1.0856 | valid ppl 2.9612 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 4 | time: 3795.42s | valid loss 1.0826 | valid ppl 2.9524 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 4 | time: 3798.31s | valid loss 1.0829 | valid ppl 2.9533 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 4 | time: 1097.39s | valid loss 1.0828 | valid ppl 2.9528 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 4 | time: 3796.62s | valid loss 1.0831 | valid ppl 2.9538 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 4 | time: 3794.73s | valid loss 1.0821 | valid ppl 2.9508 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 4 | time: 3797.00s | valid loss 1.0810 | valid ppl 2.9476 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 4 | time: 3806.15s | valid loss 1.0812 | valid ppl 2.9481 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 4 | time: 3806.71s | valid loss 1.0809 | valid ppl 2.9473 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 4 | time: 3795.87s | valid loss 1.0813 | valid ppl 2.9484 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 4 | time: 3799.98s | valid loss 1.0817 | valid ppl 2.9497 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 4 | time: 3795.32s | valid loss 1.0803 | valid ppl 2.9455 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 4 | time: 3794.34s | valid loss 1.0797 | valid ppl 2.9438 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 4 | time: 3804.34s | valid loss 1.0790 | valid ppl 2.9417 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 4 | time: 3798.90s | valid loss 1.0796 | valid ppl 2.9434 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 4 | time: 3804.95s | valid loss 1.0802 | valid ppl 2.9454 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 4 | time: 3799.98s | valid loss 1.0779 | valid ppl 2.9385 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 4 | time: 3804.99s | valid loss 1.0798 | valid ppl 2.9441 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 4 | time: 3804.92s | valid loss 1.0784 | valid ppl 2.9399 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 5 | time: 3793.19s | valid loss 1.0781 | valid ppl 2.9390 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 5 | time: 3794.63s | valid loss 1.0771 | valid ppl 2.9363 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 5 | time: 3797.63s | valid loss 1.0761 | valid ppl 2.9333 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 5 | time: 3797.24s | valid loss 1.0752 | valid ppl 2.9305 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 5 | time: 3835.87s | valid loss 1.0764 | valid ppl 2.9340 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 5 | time: 3836.48s | valid loss 1.0759 | valid ppl 2.9327 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 5 | time: 3804.72s | valid loss 1.0756 | valid ppl 2.9319 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 5 | time: 3797.48s | valid loss 1.0757 | valid ppl 2.9321 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 5 | time: 3800.06s | valid loss 1.0751 | valid ppl 2.9303 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 5 | time: 3796.96s | valid loss 1.0766 | valid ppl 2.9346 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 5 | time: 3796.87s | valid loss 1.0751 | valid ppl 2.9303 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 5 | time: 3794.98s | valid loss 1.0740 | valid ppl 2.9270 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 5 | time: 3794.18s | valid loss 1.0737 | valid ppl 2.9261 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 5 | time: 3794.87s | valid loss 1.0749 | valid ppl 2.9296 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 5 | time: 3794.59s | valid loss 1.0737 | valid ppl 2.9263 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 5 | time: 3798.73s | valid loss 1.0746 | valid ppl 2.9288 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 5 | time: 3799.97s | valid loss 1.0912 | valid ppl 2.9777 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 5 | time: 1097.48s | valid loss 1.0744 | valid ppl 2.9284 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 5 | time: 3800.18s | valid loss 1.0725 | valid ppl 2.9227 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 5 | time: 3801.07s | valid loss 1.0746 | valid ppl 2.9288 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 5 | time: 3803.87s | valid loss 1.0742 | valid ppl 2.9277 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 5 | time: 3807.38s | valid loss 1.0745 | valid ppl 2.9286 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 5 | time: 3802.41s | valid loss 1.0735 | valid ppl 2.9255 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 5 | time: 3803.85s | valid loss 1.0714 | valid ppl 2.9193 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 5 | time: 3802.20s | valid loss 1.0703 | valid ppl 2.9163 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 5 | time: 3804.97s | valid loss 1.0696 | valid ppl 2.9142 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 5 | time: 3805.82s | valid loss 1.0704 | valid ppl 2.9167 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 5 | time: 3804.59s | valid loss 1.0692 | valid ppl 2.9130 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 6 | time: 3798.75s | valid loss 1.0703 | valid ppl 2.9162 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 6 | time: 3801.06s | valid loss 1.0702 | valid ppl 2.9159 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 6 | time: 3796.51s | valid loss 1.0690 | valid ppl 2.9123 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 6 | time: 3797.49s | valid loss 1.0686 | valid ppl 2.9114 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 6 | time: 3802.58s | valid loss 1.0688 | valid ppl 2.9120 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 6 | time: 3800.26s | valid loss 1.0689 | valid ppl 2.9121 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 6 | time: 3801.18s | valid loss 1.0683 | valid ppl 2.9103 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 6 | time: 3805.98s | valid loss 1.0674 | valid ppl 2.9079 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 6 | time: 3804.26s | valid loss 1.0674 | valid ppl 2.9078 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 6 | time: 3797.98s | valid loss 1.0696 | valid ppl 2.9143 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 6 | time: 3801.56s | valid loss 1.0679 | valid ppl 2.9093 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 6 | time: 3802.48s | valid loss 1.0672 | valid ppl 2.9074 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 6 | time: 3812.54s | valid loss 1.0673 | valid ppl 2.9076 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 6 | time: 3816.47s | valid loss 1.0680 | valid ppl 2.9094 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 6 | time: 3808.34s | valid loss 1.0670 | valid ppl 2.9067 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 6 | time: 3810.71s | valid loss 1.0668 | valid ppl 2.9062 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 6 | time: 3811.31s | valid loss 1.0657 | valid ppl 2.9028 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 6 | time: 3808.51s | valid loss 1.0663 | valid ppl 2.9046 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 6 | time: 3806.94s | valid loss 1.0660 | valid ppl 2.9039 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 6 | time: 3804.47s | valid loss 1.0658 | valid ppl 2.9031 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 6 | time: 3803.28s | valid loss 1.0657 | valid ppl 2.9029 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 6 | time: 1098.89s | valid loss 1.0650 | valid ppl 2.9009 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 6 | time: 3801.72s | valid loss 1.0658 | valid ppl 2.9030 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 6 | time: 3808.12s | valid loss 1.0656 | valid ppl 2.9025 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 6 | time: 3806.53s | valid loss 1.0679 | valid ppl 2.9094 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 6 | time: 3800.71s | valid loss 1.0656 | valid ppl 2.9026 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 6 | time: 3802.33s | valid loss 1.0645 | valid ppl 2.8994 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 6 | time: 3797.75s | valid loss 1.0645 | valid ppl 2.8994 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 7 | time: 3800.93s | valid loss 1.0649 | valid ppl 2.9004 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 7 | time: 3803.64s | valid loss 1.0637 | valid ppl 2.8969 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 7 | time: 3803.79s | valid loss 1.0636 | valid ppl 2.8968 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 7 | time: 3805.63s | valid loss 1.0641 | valid ppl 2.8983 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 7 | time: 3795.80s | valid loss 1.0629 | valid ppl 2.8947 | learning rate 20.0000
+ | end of split 6 / 28 | epoch 7 | time: 3807.54s | valid loss 1.0630 | valid ppl 2.8950 | learning rate 20.0000
+ | end of split 7 / 28 | epoch 7 | time: 3804.15s | valid loss 1.0640 | valid ppl 2.8980 | learning rate 20.0000
+ | end of split 8 / 28 | epoch 7 | time: 3803.94s | valid loss 1.0637 | valid ppl 2.8972 | learning rate 20.0000
+ | end of split 9 / 28 | epoch 7 | time: 3803.38s | valid loss 1.0634 | valid ppl 2.8962 | learning rate 20.0000
+ | end of split 10 / 28 | epoch 7 | time: 3806.34s | valid loss 1.0650 | valid ppl 2.9008 | learning rate 20.0000
+ | end of split 11 / 28 | epoch 7 | time: 1098.92s | valid loss 1.0622 | valid ppl 2.8926 | learning rate 20.0000
+ | end of split 12 / 28 | epoch 7 | time: 3803.81s | valid loss 1.0622 | valid ppl 2.8926 | learning rate 20.0000
+ | end of split 13 / 28 | epoch 7 | time: 3806.59s | valid loss 1.0630 | valid ppl 2.8949 | learning rate 20.0000
+ | end of split 14 / 28 | epoch 7 | time: 3803.04s | valid loss 1.0620 | valid ppl 2.8920 | learning rate 20.0000
+ | end of split 15 / 28 | epoch 7 | time: 3803.29s | valid loss 1.0619 | valid ppl 2.8920 | learning rate 20.0000
+ | end of split 16 / 28 | epoch 7 | time: 3802.60s | valid loss 1.0630 | valid ppl 2.8950 | learning rate 20.0000
+ | end of split 17 / 28 | epoch 7 | time: 3805.28s | valid loss 1.0621 | valid ppl 2.8925 | learning rate 20.0000
+ | end of split 18 / 28 | epoch 7 | time: 3800.72s | valid loss 1.0616 | valid ppl 2.8910 | learning rate 20.0000
+ | end of split 19 / 28 | epoch 7 | time: 3801.59s | valid loss 1.0615 | valid ppl 2.8907 | learning rate 20.0000
+ | end of split 20 / 28 | epoch 7 | time: 3803.04s | valid loss 1.0610 | valid ppl 2.8892 | learning rate 20.0000
+ | end of split 21 / 28 | epoch 7 | time: 3809.57s | valid loss 1.0597 | valid ppl 2.8855 | learning rate 20.0000
+ | end of split 22 / 28 | epoch 7 | time: 3802.88s | valid loss 1.0621 | valid ppl 2.8923 | learning rate 20.0000
+ | end of split 23 / 28 | epoch 7 | time: 3799.92s | valid loss 1.0612 | valid ppl 2.8900 | learning rate 20.0000
+ | end of split 24 / 28 | epoch 7 | time: 3804.46s | valid loss 1.0615 | valid ppl 2.8907 | learning rate 20.0000
+ | end of split 25 / 28 | epoch 7 | time: 3798.64s | valid loss 1.0599 | valid ppl 2.8862 | learning rate 20.0000
+ | end of split 26 / 28 | epoch 7 | time: 3799.12s | valid loss 1.0603 | valid ppl 2.8873 | learning rate 20.0000
+ | end of split 27 / 28 | epoch 7 | time: 3798.12s | valid loss 1.0606 | valid ppl 2.8880 | learning rate 20.0000
+ | end of split 28 / 28 | epoch 7 | time: 3805.05s | valid loss 1.0604 | valid ppl 2.8875 | learning rate 20.0000
+ | end of split 1 / 28 | epoch 8 | time: 3797.40s | valid loss 1.0600 | valid ppl 2.8863 | learning rate 20.0000
+ | end of split 2 / 28 | epoch 8 | time: 3796.23s | valid loss 1.0608 | valid ppl 2.8886 | learning rate 20.0000
+ | end of split 3 / 28 | epoch 8 | time: 3797.50s | valid loss 1.0626 | valid ppl 2.8940 | learning rate 20.0000
+ | end of split 4 / 28 | epoch 8 | time: 3798.81s | valid loss 1.0599 | valid ppl 2.8861 | learning rate 20.0000
+ | end of split 5 / 28 | epoch 8 | time: 3800.00s | valid loss 1.0562 | valid ppl 2.8756 | learning rate 5.0000
+ | end of split 6 / 28 | epoch 8 | time: 3806.43s | valid loss 1.0559 | valid ppl 2.8747 | learning rate 5.0000
+ | end of split 7 / 28 | epoch 8 | time: 3804.50s | valid loss 1.0557 | valid ppl 2.8739 | learning rate 5.0000
+ | end of split 8 / 28 | epoch 8 | time: 3803.18s | valid loss 1.0555 | valid ppl 2.8735 | learning rate 5.0000
+ | end of split 9 / 28 | epoch 8 | time: 1098.26s | valid loss 1.0555 | valid ppl 2.8734 | learning rate 5.0000
+ | end of split 10 / 28 | epoch 8 | time: 3803.32s | valid loss 1.0553 | valid ppl 2.8730 | learning rate 5.0000
+ | end of split 11 / 28 | epoch 8 | time: 3805.59s | valid loss 1.0553 | valid ppl 2.8728 | learning rate 5.0000
+ | end of split 12 / 28 | epoch 8 | time: 3798.28s | valid loss 1.0551 | valid ppl 2.8724 | learning rate 5.0000
+ | end of split 13 / 28 | epoch 8 | time: 3798.22s | valid loss 1.0551 | valid ppl 2.8722 | learning rate 5.0000
+ | end of split 14 / 28 | epoch 8 | time: 3798.98s | valid loss 1.0550 | valid ppl 2.8720 | learning rate 5.0000
+ | end of split 15 / 28 | epoch 8 | time: 3796.37s | valid loss 1.0550 | valid ppl 2.8719 | learning rate 5.0000
+ | end of split 16 / 28 | epoch 8 | time: 3792.33s | valid loss 1.0549 | valid ppl 2.8717 | learning rate 5.0000
+ | end of split 17 / 28 | epoch 8 | time: 3801.12s | valid loss 1.0548 | valid ppl 2.8715 | learning rate 5.0000
+ | end of split 18 / 28 | epoch 8 | time: 3803.54s | valid loss 1.0548 | valid ppl 2.8713 | learning rate 5.0000
+ | end of split 19 / 28 | epoch 8 | time: 3794.99s | valid loss 1.0547 | valid ppl 2.8712 | learning rate 5.0000
+ | end of split 20 / 28 | epoch 8 | time: 3800.67s | valid loss 1.0546 | valid ppl 2.8709 | learning rate 5.0000
+ | end of split 21 / 28 | epoch 8 | time: 3802.07s | valid loss 1.0547 | valid ppl 2.8710 | learning rate 5.0000
+ | end of split 22 / 28 | epoch 8 | time: 3795.63s | valid loss 1.0546 | valid ppl 2.8707 | learning rate 5.0000
+ | end of split 23 / 28 | epoch 8 | time: 3797.48s | valid loss 1.0545 | valid ppl 2.8705 | learning rate 5.0000
+ | end of split 24 / 28 | epoch 8 | time: 3826.24s | valid loss 1.0545 | valid ppl 2.8705 | learning rate 5.0000
+ | end of split 25 / 28 | epoch 8 | time: 3796.29s | valid loss 1.0543 | valid ppl 2.8701 | learning rate 5.0000
+ | end of split 26 / 28 | epoch 8 | time: 3803.96s | valid loss 1.0545 | valid ppl 2.8705 | learning rate 5.0000
+ | end of split 27 / 28 | epoch 8 | time: 3802.34s | valid loss 1.0543 | valid ppl 2.8700 | learning rate 5.0000
+ | end of split 28 / 28 | epoch 8 | time: 3803.96s | valid loss 1.0543 | valid ppl 2.8699 | learning rate 5.0000
225
+ | end of split 1 / 28 | epoch 9 | time: 3798.65s | valid loss 1.0542 | valid ppl 2.8697 | learning rate 5.0000
226
+ | end of split 2 / 28 | epoch 9 | time: 3801.55s | valid loss 1.0542 | valid ppl 2.8696 | learning rate 5.0000
227
+ | end of split 3 / 28 | epoch 9 | time: 3806.56s | valid loss 1.0541 | valid ppl 2.8693 | learning rate 5.0000
228
+ | end of split 4 / 28 | epoch 9 | time: 3801.41s | valid loss 1.0541 | valid ppl 2.8695 | learning rate 5.0000
229
+ | end of split 5 / 28 | epoch 9 | time: 3799.18s | valid loss 1.0540 | valid ppl 2.8692 | learning rate 5.0000
230
+ | end of split 6 / 28 | epoch 9 | time: 3801.41s | valid loss 1.0540 | valid ppl 2.8690 | learning rate 5.0000
231
+ | end of split 7 / 28 | epoch 9 | time: 3792.65s | valid loss 1.0539 | valid ppl 2.8687 | learning rate 5.0000
232
+ | end of split 8 / 28 | epoch 9 | time: 3801.50s | valid loss 1.0539 | valid ppl 2.8688 | learning rate 5.0000
233
+ | end of split 9 / 28 | epoch 9 | time: 3799.22s | valid loss 1.0539 | valid ppl 2.8689 | learning rate 5.0000
234
+ | end of split 10 / 28 | epoch 9 | time: 3798.30s | valid loss 1.0537 | valid ppl 2.8683 | learning rate 5.0000
235
+ | end of split 11 / 28 | epoch 9 | time: 3794.81s | valid loss 1.0537 | valid ppl 2.8682 | learning rate 5.0000
236
+ | end of split 12 / 28 | epoch 9 | time: 3794.04s | valid loss 1.0537 | valid ppl 2.8682 | learning rate 5.0000
237
+ | end of split 13 / 28 | epoch 9 | time: 3798.63s | valid loss 1.0537 | valid ppl 2.8683 | learning rate 5.0000
238
+ | end of split 14 / 28 | epoch 9 | time: 3797.90s | valid loss 1.0535 | valid ppl 2.8678 | learning rate 5.0000
239
+ | end of split 15 / 28 | epoch 9 | time: 3796.44s | valid loss 1.0536 | valid ppl 2.8680 | learning rate 5.0000
240
+ | end of split 16 / 28 | epoch 9 | time: 3798.41s | valid loss 1.0536 | valid ppl 2.8678 | learning rate 5.0000
241
+ | end of split 17 / 28 | epoch 9 | time: 3799.93s | valid loss 1.0535 | valid ppl 2.8676 | learning rate 5.0000
242
+ | end of split 18 / 28 | epoch 9 | time: 3803.40s | valid loss 1.0534 | valid ppl 2.8673 | learning rate 5.0000
243
+ | end of split 19 / 28 | epoch 9 | time: 3807.52s | valid loss 1.0537 | valid ppl 2.8683 | learning rate 5.0000
244
+ | end of split 20 / 28 | epoch 9 | time: 3807.58s | valid loss 1.0534 | valid ppl 2.8673 | learning rate 5.0000
245
+ | end of split 21 / 28 | epoch 9 | time: 3799.18s | valid loss 1.0533 | valid ppl 2.8672 | learning rate 5.0000
246
+ | end of split 22 / 28 | epoch 9 | time: 3800.62s | valid loss 1.0532 | valid ppl 2.8668 | learning rate 5.0000
247
+ | end of split 23 / 28 | epoch 9 | time: 3796.79s | valid loss 1.0532 | valid ppl 2.8667 | learning rate 5.0000
248
+ | end of split 24 / 28 | epoch 9 | time: 1097.06s | valid loss 1.0532 | valid ppl 2.8669 | learning rate 5.0000
249
+ | end of split 25 / 28 | epoch 9 | time: 3795.86s | valid loss 1.0532 | valid ppl 2.8669 | learning rate 5.0000
250
+ | end of split 26 / 28 | epoch 9 | time: 3803.14s | valid loss 1.0531 | valid ppl 2.8665 | learning rate 5.0000
+ | end of split 27 / 28 | epoch 9 | time: 3798.92s | valid loss 1.0530 | valid ppl 2.8663 | learning rate 5.0000
+ | end of split 28 / 28 | epoch 9 | time: 3799.90s | valid loss 1.0530 | valid ppl 2.8663 | learning rate 5.0000
+ | end of split 1 / 28 | epoch 10 | time: 3798.57s | valid loss 1.0530 | valid ppl 2.8662 | learning rate 5.0000
+ | end of split 2 / 28 | epoch 10 | time: 3798.13s | valid loss 1.0529 | valid ppl 2.8661 | learning rate 5.0000
+ | end of split 3 / 28 | epoch 10 | time: 3799.82s | valid loss 1.0530 | valid ppl 2.8662 | learning rate 5.0000
+ | end of split 4 / 28 | epoch 10 | time: 3802.23s | valid loss 1.0529 | valid ppl 2.8659 | learning rate 5.0000
+ | end of split 5 / 28 | epoch 10 | time: 3801.56s | valid loss 1.0529 | valid ppl 2.8660 | learning rate 5.0000
+ | end of split 6 / 28 | epoch 10 | time: 3798.08s | valid loss 1.0528 | valid ppl 2.8656 | learning rate 5.0000
+ | end of split 7 / 28 | epoch 10 | time: 3800.12s | valid loss 1.0528 | valid ppl 2.8656 | learning rate 5.0000
+ | end of split 8 / 28 | epoch 10 | time: 3800.94s | valid loss 1.0526 | valid ppl 2.8652 | learning rate 5.0000
+ | end of split 9 / 28 | epoch 10 | time: 3801.43s | valid loss 1.0529 | valid ppl 2.8659 | learning rate 5.0000
+ | end of split 10 / 28 | epoch 10 | time: 3798.47s | valid loss 1.0526 | valid ppl 2.8652 | learning rate 5.0000
+ | end of split 11 / 28 | epoch 10 | time: 3803.15s | valid loss 1.0526 | valid ppl 2.8650 | learning rate 5.0000
+ | end of split 12 / 28 | epoch 10 | time: 3800.32s | valid loss 1.0526 | valid ppl 2.8650 | learning rate 5.0000
+ | end of split 13 / 28 | epoch 10 | time: 3802.61s | valid loss 1.0525 | valid ppl 2.8647 | learning rate 5.0000
+ | end of split 14 / 28 | epoch 10 | time: 3799.08s | valid loss 1.0525 | valid ppl 2.8648 | learning rate 5.0000
+ | end of split 15 / 28 | epoch 10 | time: 3801.19s | valid loss 1.0525 | valid ppl 2.8647 | learning rate 5.0000
+ | end of split 16 / 28 | epoch 10 | time: 3801.20s | valid loss 1.0524 | valid ppl 2.8646 | learning rate 5.0000
+ | end of split 17 / 28 | epoch 10 | time: 3802.37s | valid loss 1.0524 | valid ppl 2.8645 | learning rate 5.0000
+ | end of split 18 / 28 | epoch 10 | time: 3805.85s | valid loss 1.0523 | valid ppl 2.8643 | learning rate 5.0000
+ | end of split 19 / 28 | epoch 10 | time: 3804.15s | valid loss 1.0524 | valid ppl 2.8644 | learning rate 5.0000
+ | end of split 20 / 28 | epoch 10 | time: 3806.41s | valid loss 1.0523 | valid ppl 2.8642 | learning rate 5.0000
+ | end of split 21 / 28 | epoch 10 | time: 3809.13s | valid loss 1.0522 | valid ppl 2.8639 | learning rate 5.0000
+ | end of split 22 / 28 | epoch 10 | time: 3798.99s | valid loss 1.0523 | valid ppl 2.8641 | learning rate 5.0000
+ | end of split 23 / 28 | epoch 10 | time: 3802.76s | valid loss 1.0522 | valid ppl 2.8639 | learning rate 5.0000
+ | end of split 24 / 28 | epoch 10 | time: 3805.95s | valid loss 1.0522 | valid ppl 2.8639 | learning rate 5.0000
+ | end of split 25 / 28 | epoch 10 | time: 3803.67s | valid loss 1.0522 | valid ppl 2.8639 | learning rate 5.0000
+ | end of split 26 / 28 | epoch 10 | time: 3802.75s | valid loss 1.0521 | valid ppl 2.8635 | learning rate 5.0000
+ | end of split 27 / 28 | epoch 10 | time: 3804.63s | valid loss 1.0520 | valid ppl 2.8633 | learning rate 5.0000
+ | end of split 28 / 28 | epoch 10 | time: 1097.97s | valid loss 1.0520 | valid ppl 2.8634 | learning rate 5.0000
+ | end of split 1 / 28 | epoch 11 | time: 3793.51s | valid loss 1.0520 | valid ppl 2.8634 | learning rate 5.0000
+ | end of split 2 / 28 | epoch 11 | time: 3802.15s | valid loss 1.0520 | valid ppl 2.8633 | learning rate 5.0000
+ | end of split 3 / 28 | epoch 11 | time: 3801.09s | valid loss 1.0518 | valid ppl 2.8629 | learning rate 5.0000
+ | end of split 4 / 28 | epoch 11 | time: 3803.88s | valid loss 1.0518 | valid ppl 2.8629 | learning rate 5.0000
+ | end of split 5 / 28 | epoch 11 | time: 3803.72s | valid loss 1.0518 | valid ppl 2.8628 | learning rate 5.0000
+ | end of split 6 / 28 | epoch 11 | time: 3803.50s | valid loss 1.0518 | valid ppl 2.8629 | learning rate 5.0000
+ | end of split 7 / 28 | epoch 11 | time: 3798.93s | valid loss 1.0518 | valid ppl 2.8627 | learning rate 5.0000
+ | end of split 8 / 28 | epoch 11 | time: 3798.59s | valid loss 1.0516 | valid ppl 2.8623 | learning rate 5.0000
+ | end of split 9 / 28 | epoch 11 | time: 3797.52s | valid loss 1.0517 | valid ppl 2.8624 | learning rate 5.0000
+ | end of split 10 / 28 | epoch 11 | time: 3806.92s | valid loss 1.0518 | valid ppl 2.8627 | learning rate 5.0000
+ | end of split 11 / 28 | epoch 11 | time: 3806.04s | valid loss 1.0516 | valid ppl 2.8622 | learning rate 5.0000
+ | end of split 12 / 28 | epoch 11 | time: 3801.39s | valid loss 1.0519 | valid ppl 2.8632 | learning rate 5.0000
+ | end of split 13 / 28 | epoch 11 | time: 3801.24s | valid loss 1.0516 | valid ppl 2.8622 | learning rate 5.0000
+ | end of split 14 / 28 | epoch 11 | time: 3804.44s | valid loss 1.0515 | valid ppl 2.8620 | learning rate 5.0000
+ | end of split 15 / 28 | epoch 11 | time: 3801.34s | valid loss 1.0515 | valid ppl 2.8620 | learning rate 5.0000
+ | end of split 16 / 28 | epoch 11 | time: 3803.14s | valid loss 1.0514 | valid ppl 2.8618 | learning rate 5.0000
+ | end of split 17 / 28 | epoch 11 | time: 3801.11s | valid loss 1.0514 | valid ppl 2.8617 | learning rate 5.0000
+ | end of split 18 / 28 | epoch 11 | time: 3804.58s | valid loss 1.0513 | valid ppl 2.8613 | learning rate 5.0000
+ | end of split 19 / 28 | epoch 11 | time: 3796.04s | valid loss 1.0513 | valid ppl 2.8615 | learning rate 5.0000
+ | end of split 20 / 28 | epoch 11 | time: 3797.12s | valid loss 1.0512 | valid ppl 2.8611 | learning rate 5.0000
+ | end of split 21 / 28 | epoch 11 | time: 1097.96s | valid loss 1.0512 | valid ppl 2.8612 | learning rate 5.0000
+ | end of split 22 / 28 | epoch 11 | time: 3800.79s | valid loss 1.0513 | valid ppl 2.8613 | learning rate 5.0000
+ | end of split 23 / 28 | epoch 11 | time: 3801.51s | valid loss 1.0518 | valid ppl 2.8629 | learning rate 5.0000
+ | end of split 24 / 28 | epoch 11 | time: 3798.63s | valid loss 1.0513 | valid ppl 2.8614 | learning rate 5.0000
+ | end of split 25 / 28 | epoch 11 | time: 3796.99s | valid loss 1.0512 | valid ppl 2.8612 | learning rate 5.0000
+ | end of split 26 / 28 | epoch 11 | time: 3797.77s | valid loss 1.0512 | valid ppl 2.8610 | learning rate 5.0000
+ | end of split 27 / 28 | epoch 11 | time: 3797.73s | valid loss 1.0512 | valid ppl 2.8610 | learning rate 5.0000
+ | end of split 28 / 28 | epoch 11 | time: 3800.03s | valid loss 1.0511 | valid ppl 2.8607 | learning rate 5.0000
+ | end of split 1 / 28 | epoch 12 | time: 3796.72s | valid loss 1.0511 | valid ppl 2.8609 | learning rate 5.0000
+ | end of split 2 / 28 | epoch 12 | time: 1097.45s | valid loss 1.0510 | valid ppl 2.8604 | learning rate 5.0000
+ | end of split 3 / 28 | epoch 12 | time: 3803.10s | valid loss 1.0510 | valid ppl 2.8606 | learning rate 5.0000
+ | end of split 4 / 28 | epoch 12 | time: 3803.38s | valid loss 1.0510 | valid ppl 2.8604 | learning rate 5.0000
+ | end of split 5 / 28 | epoch 12 | time: 3796.86s | valid loss 1.0509 | valid ppl 2.8602 | learning rate 5.0000
+ | end of split 6 / 28 | epoch 12 | time: 3804.85s | valid loss 1.0509 | valid ppl 2.8601 | learning rate 5.0000
+ | end of split 7 / 28 | epoch 12 | time: 3804.65s | valid loss 1.0509 | valid ppl 2.8601 | learning rate 5.0000
+ | end of split 8 / 28 | epoch 12 | time: 3806.75s | valid loss 1.0508 | valid ppl 2.8599 | learning rate 5.0000
+ | end of split 9 / 28 | epoch 12 | time: 3800.05s | valid loss 1.0507 | valid ppl 2.8597 | learning rate 5.0000
+ | end of split 10 / 28 | epoch 12 | time: 3802.67s | valid loss 1.0507 | valid ppl 2.8596 | learning rate 5.0000
+ | end of split 11 / 28 | epoch 12 | time: 3806.56s | valid loss 1.0508 | valid ppl 2.8598 | learning rate 5.0000
+ | end of split 12 / 28 | epoch 12 | time: 3804.49s | valid loss 1.0507 | valid ppl 2.8598 | learning rate 5.0000
+ | end of split 13 / 28 | epoch 12 | time: 3804.60s | valid loss 1.0507 | valid ppl 2.8595 | learning rate 5.0000
+ | end of split 14 / 28 | epoch 12 | time: 3799.49s | valid loss 1.0506 | valid ppl 2.8594 | learning rate 5.0000
+ | end of split 15 / 28 | epoch 12 | time: 3807.23s | valid loss 1.0506 | valid ppl 2.8595 | learning rate 5.0000
+ | end of split 16 / 28 | epoch 12 | time: 3798.38s | valid loss 1.0506 | valid ppl 2.8592 | learning rate 5.0000
+ | end of split 17 / 28 | epoch 12 | time: 3806.09s | valid loss 1.0506 | valid ppl 2.8595 | learning rate 5.0000
+ | end of split 18 / 28 | epoch 12 | time: 3797.37s | valid loss 1.0506 | valid ppl 2.8594 | learning rate 5.0000
+ | end of split 19 / 28 | epoch 12 | time: 3800.94s | valid loss 1.0505 | valid ppl 2.8589 | learning rate 5.0000
+ | end of split 20 / 28 | epoch 12 | time: 3796.71s | valid loss 1.0505 | valid ppl 2.8590 | learning rate 5.0000
+ | end of split 21 / 28 | epoch 12 | time: 3795.95s | valid loss 1.0504 | valid ppl 2.8588 | learning rate 5.0000
+ | end of split 22 / 28 | epoch 12 | time: 3793.39s | valid loss 1.0504 | valid ppl 2.8588 | learning rate 5.0000
+ | end of split 23 / 28 | epoch 12 | time: 3797.13s | valid loss 1.0503 | valid ppl 2.8586 | learning rate 5.0000
+ | end of split 24 / 28 | epoch 12 | time: 3802.93s | valid loss 1.0503 | valid ppl 2.8586 | learning rate 5.0000
+ | end of split 25 / 28 | epoch 12 | time: 3798.55s | valid loss 1.0502 | valid ppl 2.8582 | learning rate 5.0000
+ | end of split 26 / 28 | epoch 12 | time: 3797.73s | valid loss 1.0502 | valid ppl 2.8582 | learning rate 5.0000
+ | end of split 27 / 28 | epoch 12 | time: 3798.53s | valid loss 1.0502 | valid ppl 2.8582 | learning rate 5.0000
+ | end of split 28 / 28 | epoch 12 | time: 3797.17s | valid loss 1.0502 | valid ppl 2.8582 | learning rate 5.0000
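The `valid ppl` column in the log above appears to be the exponential of `valid loss` (the usual definition of perplexity for a loss measured in nats). A minimal sketch for checking that relationship on a single log line follows; the helper names `parse_loss_and_ppl` and `ppl_matches_loss` and the tolerance are assumptions made for illustration, not part of this commit.

```python
import math
import re
from typing import Tuple

# Matches the "valid loss" and "valid ppl" fields as they appear in loss.txt.
LINE_RE = re.compile(r"valid loss\s+([\d.]+)\s+\|\s+valid ppl\s+([\d.]+)")


def parse_loss_and_ppl(line: str) -> Tuple[float, float]:
    """Extract (valid loss, valid ppl) from one log line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"no loss/ppl fields found in: {line!r}")
    return float(match.group(1)), float(match.group(2))


def ppl_matches_loss(line: str, tol: float = 5e-4) -> bool:
    """Check that ppl is exp(loss), allowing for the log's 4-decimal rounding."""
    loss, ppl = parse_loss_and_ppl(line)
    return abs(math.exp(loss) - ppl) <= tol


sample = (
    "| end of split 28 / 28 | epoch 12 | time: 3797.17s "
    "| valid loss 1.0502 | valid ppl 2.8582 | learning rate 5.0000"
)
print(parse_loss_and_ppl(sample), ppl_matches_loss(sample))
```

The tolerance is loose on purpose: both columns are rounded to four decimals, so `exp(loss)` and the logged perplexity can differ in the last digit.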
pipeline.py ADDED
@@ -0,0 +1,22 @@
+ from typing import List, Dict
+ from flair.models.language_model import LanguageModel
+
+
+ class PreTrainedPipeline:
+     def __init__(self, path=""):
+         from huggingface_hub import hf_hub_download
+
+         self.model = LanguageModel.load_language_model(
+             hf_hub_download(repo_id="dchaplinsky/flair-uk-backward", filename="best-lm.pt")
+         )
+
+     def __call__(self, inputs: str) -> List[Dict]:
+         """
+         Args:
+             inputs (:obj:`str`):
+                 a string containing some text
+         Return:
+             A :obj:`list` with one :obj:`dict` holding the text under the `generated_text` key
+         """
+         inputs = inputs.strip()
+         return [{"generated_text": self.model.generate_text(inputs[::-1], temperature=0.5)[0]}]
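The pipeline above passes the prompt to the model reversed (`inputs[::-1]`), which fits a backward character model: it continues text from right to left, so the generated characters precede the prompt. The sketch below illustrates that convention without downloading the model. `StubBackwardLM` and `generate` are hypothetical stand-ins, and the stub's behaviour (appending a fixed continuation and returning the text flipped back to reading order) is an assumption about how a backward model behaves, not a claim about flair's implementation.

```python
from typing import Dict, List, Tuple


class StubBackwardLM:
    """Hypothetical stand-in for a backward character LM (not flair's API)."""

    def generate_text(self, prefix: str, temperature: float = 1.0) -> Tuple[str, float]:
        # A real model would sample characters; the stub appends a fixed
        # continuation in reversed (right-to-left) character order.
        continuation = " ,kO"
        reversed_text = prefix + continuation
        # Assumption: the model returns the text flipped back to reading order,
        # together with a score (a dummy 0.0 here).
        return reversed_text[::-1], 0.0


def generate(model: StubBackwardLM, inputs: str) -> List[Dict]:
    """Mirror the pipeline's __call__: strip, reverse the prompt, generate."""
    inputs = inputs.strip()
    text, _score = model.generate_text(inputs[::-1], temperature=0.5)
    return [{"generated_text": text}]


print(generate(StubBackwardLM(), " done. "))
```

With the stub, the prompt stays at the end of the result and the generated characters land in front of it, which matches the widget prompts in the README that read like sentence endings.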
requirements.txt ADDED
@@ -0,0 +1 @@
+ flair