v2.0 with teacher all-mpnet-base-v2, trained with longer paragraphs
- README.md +54 -6
- config.json +1 -1
- eval/mse_evaluation_TED2020-en-sv-dev.tsv.gz_results.csv +116 -123
- eval/mse_evaluation_Tatoeba-eng-swe-dev.tsv.gz_results.csv +116 -123
- eval/translation_evaluation_TED2020-en-sv-dev.tsv.gz_results.csv +116 -123
- eval/translation_evaluation_Tatoeba-eng-swe-dev.tsv.gz_results.csv +116 -123
- pytorch_model.bin +1 -1
- tokenizer_config.json +1 -1
README.md
CHANGED
@@ -35,10 +35,18 @@ widget:
 
 # KBLab/sentence-bert-swedish-cased
 
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps Swedish sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search. This model is a bilingual Swedish-English model trained according to instructions in the paper [Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation](https://arxiv.org/pdf/2004.09813.pdf) and the [documentation](https://www.sbert.net/examples/training/multilingual/README.html) accompanying its companion python package. We have used the strongest available pretrained English Bi-Encoder ([
+This is a [sentence-transformers](https://www.SBERT.net) model: it maps Swedish sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It is a bilingual Swedish-English model trained according to the instructions in the paper [Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation](https://arxiv.org/pdf/2004.09813.pdf) and the [documentation](https://www.sbert.net/examples/training/multilingual/README.html) accompanying its companion python package. We have used the strongest available pretrained English Bi-Encoder ([all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)) as the teacher model, and the pretrained Swedish [KB-BERT](https://huggingface.co/KB/bert-base-swedish-cased) as the student model.
 
 A more detailed description of the model can be found in an article we published on the [KBLab blog](https://kb-labb.github.io/posts/2021-08-23-a-swedish-sentence-transformer/).
 
+**Update**: We have released updated versions of the model since the initial release. The original model described in the blog post is **v1.0**. The newer versions are trained on longer paragraphs and have a longer max sequence length; **v2.0** is trained with a stronger teacher model and is the current default.
+
+| Model version | Teacher Model | Max Sequence Length |
+|---------------|---------------|---------------------|
+| v1.0 | [paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) | 256 |
+| v1.1 | [paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) | 384 |
+| v2.0 | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) | 384 |
+
 <!--- Describe your model here -->
 
 ## Usage (Sentence-Transformers)
@@ -81,6 +89,7 @@ def mean_pooling(model_output, attention_mask):
 sentences = ['Det här är en exempelmening', 'Varje exempel blir konverterad']
 
 # Load model from HuggingFace Hub
+# To load an older version, e.g. v1.0, add the argument revision="v1.0"
 tokenizer = AutoTokenizer.from_pretrained('KBLab/sentence-bert-swedish-cased')
 model = AutoModel.from_pretrained('KBLab/sentence-bert-swedish-cased')
 
@@ -98,13 +107,19 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 
-
+To load an older model, specify the version tag with the `revision` arg: `AutoTokenizer.from_pretrained('KBLab/sentence-bert-swedish-cased', revision="v1.0")`.
 
 ## Evaluation Results
 
 <!--- Describe how your model was evaluated -->
 
-The model was
+The model was evaluated on [SweParaphrase v1.0](https://spraakbanken.gu.se/en/resources/sweparaphrase) and **SweParaphrase v2.0**. These test sets are part of [SuperLim](https://spraakbanken.gu.se/en/resources/superlim) -- a Swedish evaluation suite for natural language understanding tasks. We calculated the Pearson and Spearman correlations between the predicted model similarity scores and the human similarity score labels. Results on **SweParaphrase v1.0** are displayed below.
+
+| Model version | Pearson | Spearman |
+|---------------|---------|----------|
+| v1.0 | 0.9183 | 0.9114 |
+| v1.1 | 0.9183 | 0.9114 |
+| v2.0 | **0.9283** | **0.9130** |
 
 The following code snippet can be used to reproduce the above results:
 
@@ -149,13 +164,46 @@ print(df[["score", "model_score"]].corr(method="spearman"))
 print(df[["score", "model_score"]].corr(method="pearson"))
 ```
 
-
+### SweParaphrase v2.0
+
+In general, **v1.1** correlates the most with human assessments of text similarity on SweParaphrase v2.0. Below, we present zero-shot evaluation results on all data splits. They display the model's performance out of the box, without any fine-tuning.
+
+| Model version | Data split | Pearson | Spearman |
+|---------------|------------|------------|------------|
+| v1.0 | train | 0.8355 | 0.8256 |
+| v1.1 | train | **0.8383** | **0.8302** |
+| v2.0 | train | 0.8209 | 0.8059 |
+| v1.0 | dev | 0.8682 | 0.8774 |
+| v1.1 | dev | **0.8739** | **0.8833** |
+| v2.0 | dev | 0.8638 | 0.8668 |
+| v1.0 | test | 0.8356 | 0.8476 |
+| v1.1 | test | **0.8393** | **0.8550** |
+| v2.0 | test | 0.8232 | 0.8213 |
+
+### SweFAQ v2.0
+
+On retrieval tasks, **v2.0** performs best by a substantial margin: it is better at matching the correct answer to a question than v1.1 and v1.0.
+
+| Model version | Data split | Accuracy |
+|---------------|------------|------------|
+| v1.0 | train | 0.5262 |
+| v1.1 | train | 0.6236 |
+| v2.0 | train | **0.7106** |
+| v1.0 | dev | 0.4636 |
+| v1.1 | dev | 0.5818 |
+| v2.0 | dev | **0.6727** |
+| v1.0 | test | 0.4495 |
+| v1.1 | test | 0.5229 |
+| v2.0 | test | **0.5871** |
+
+
+Examples of how to evaluate the models on some of the test sets of the SuperLim suite can be found at the following links: [evaluate_faq.py](https://github.com/kb-labb/swedish-sbert/blob/main/evaluate_faq.py) (Swedish FAQ), [evaluate_swesat.py](https://github.com/kb-labb/swedish-sbert/blob/main/evaluate_swesat.py) (SweSAT synonyms), [evaluate_supersim.py](https://github.com/kb-labb/swedish-sbert/blob/main/evaluate_supersim.py) (SuperSim).
 
 ## Training
 
-An article with more details on data and the model can be found on the [KBLab blog](https://kb-labb.github.io/posts/2021-08-23-a-swedish-sentence-transformer/).
+An article with more details on the data and on v1.0 of the model can be found on the [KBLab blog](https://kb-labb.github.io/posts/2021-08-23-a-swedish-sentence-transformer/).
 
-Around 14.6 million sentences from English-Swedish parallel corpuses were used to train the model. Data was sourced from the [Open Parallel Corpus](https://opus.nlpl.eu/) (OPUS) and downloaded via the python package [opustools](https://pypi.org/project/opustools/). Datasets used were: JW300, Europarl,
+Around 14.6 million sentences from English-Swedish parallel corpora were used to train the model. Data was sourced from the [Open Parallel Corpus](https://opus.nlpl.eu/) (OPUS) and downloaded via the python package [opustools](https://pypi.org/project/opustools/). Datasets used were: JW300, Europarl, DGT-TM, EMEA, ELITR-ECA, TED2020, Tatoeba and OpenSubtitles.
 
 The model was trained with the parameters:
 
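Putting the pieces of the README diff together, the HuggingFace Transformers usage path and the new `revision` argument combine into one runnable snippet. A minimal sketch; the mean-pooling helper is the one the README defines, and the `v1.0` tag is one of the revisions listed in the version table above:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Mean pooling: average the token embeddings, masking out padding positions.
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element holds all token embeddings
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

sentences = ['Det här är en exempelmening', 'Varje exempel blir konverterad']

# Pin a model version with `revision`; omit the argument to get the default (v2.0).
tokenizer = AutoTokenizer.from_pretrained('KBLab/sentence-bert-swedish-cased', revision="v1.0")
model = AutoModel.from_pretrained('KBLab/sentence-bert-swedish-cased', revision="v1.0")

encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    model_output = model(**encoded_input)
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print(sentence_embeddings.shape)  # torch.Size([2, 768])
```

Pinning a revision also matters for reproducibility: v1.x and v2.0 embed the same sentence differently, so embeddings from mixed versions should not be compared.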
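The SweParaphrase evaluation that the README's snippet reproduces boils down to scoring each sentence pair by cosine similarity and correlating that with the human labels. A rough sketch; the file name and the sentence column names are illustrative assumptions (only `score` and `model_score` come from the README's own snippet), so adjust them to the actual SweParaphrase release:

```python
import pandas as pd
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Hypothetical file and column names; adjust to the actual SweParaphrase data.
df = pd.read_csv("sweparaphrase-dev.tsv", sep="\t")

model = SentenceTransformer("KBLab/sentence-bert-swedish-cased")
emb1 = model.encode(df["sentence_1"].tolist())
emb2 = model.encode(df["sentence_2"].tolist())

# Model similarity score for each pair: cosine similarity of its two embeddings.
df["model_score"] = [float(cos_sim(e1, e2)) for e1, e2 in zip(emb1, emb2)]

print(df[["score", "model_score"]].corr(method="spearman"))
print(df[["score", "model_score"]].corr(method="pearson"))
```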
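The training recipe the README points to (and whose progress the `mse_evaluation_*` and `translation_evaluation_*` files below track) is the knowledge-distillation setup from the cited paper: the Swedish student is trained so that its embeddings of both an English sentence and its Swedish translation match the teacher's embedding of the English sentence. A minimal sketch with `sentence-transformers`; the data file name, batch size, and epoch count are illustrative, not this commit's actual training configuration:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import ParallelSentencesDataset

# Teacher: the strong English bi-encoder named in the commit message.
teacher = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

# Student: Swedish KB-BERT with mean pooling, max sequence length 384 as in v2.0.
word_embeddings = models.Transformer('KB/bert-base-swedish-cased', max_seq_length=384)
pooling = models.Pooling(word_embeddings.get_word_embedding_dimension())
student = SentenceTransformer(modules=[word_embeddings, pooling])

# Tab-separated "english<TAB>swedish" pairs; the student learns to map both
# sentences onto the teacher's embedding of the English side.
train_data = ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data('parallel-en-sv.tsv.gz')  # illustrative path
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=64)

# MSE between student and teacher embeddings -- the same quantity the
# mse_evaluation_*.csv files report on the TED2020 and Tatoeba dev sets.
train_loss = losses.MSELoss(model=student)
student.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10000)
```

The translation_evaluation files track the complementary check: how often the nearest neighbour of an English sentence among the Swedish embeddings is its actual translation (src2trg), and vice versa (trg2src).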
config.json
CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "output/
+  "_name_or_path": "output/no-normalize-en-sv-2022-12-26_18-39-42/",
   "architectures": [
     "BertModel"
   ],
eval/mse_evaluation_TED2020-en-sv-dev.tsv.gz_results.csv
CHANGED
@@ -1,124 +1,117 @@
 epoch,steps,MSE
-0,1000,0.
-0,2000,0.
-0,3000,0.
-0,4000,0.
-0,5000,0.
-0,6000,0.
-0,7000,0.
-0,8000,0.
-0,9000,0.
-0,10000,0.
-0,11000,0.
-0,12000,0.
-0,13000,0.
-0,14000,0.
-0,15000,0.
-0,16000,0.
-0,17000,0.
-0,18000,0.
-0,19000,0.
-0,20000,0.
-0,21000,0.
-0,22000,0.
-0,23000,0.
-0,24000,0.
-0,25000,0.
-0,26000,0.
-0,27000,0.
-0,28000,0.
-0,29000,0.
-0,30000,0.
-0,31000,0.
-0,32000,0.
-0,33000,0.
-0,34000,0.
-0,35000,0.
-0,36000,0.
-0,37000,0.
-0,38000,0.
-0,39000,0.
-0,40000,0.
-0,41000,0.
-0,42000,0.
-0,43000,0.
-0,44000,0.
-0,45000,0.
-0,46000,0.
-0,47000,0.
-0,48000,0.
-0,49000,0.
-0,50000,0.
-0,51000,0.
-0,52000,0.
-0,53000,0.
-0,54000,0.
-0,55000,0.
-0,56000,0.
-0,57000,0.
-0,58000,0.
-0,59000,0.
-0,60000,0.
-0,61000,0.
-0,62000,0.
-0,63000,0.
-0,64000,0.
-0,65000,0.
-0,66000,0.
-0,67000,0.
-0,68000,0.
-0,69000,0.
-0,70000,0.
-0,71000,0.
-0,72000,0.
-0,73000,0.
-0,74000,0.
-0,75000,0.
-0,76000,0.
-0,77000,0.
-0,78000,0.
-0,79000,0.
-0,80000,0.
-0,81000,0.
-0,82000,0.
-0,83000,0.
-0,84000,0.
-0,85000,0.
-0,86000,0.
-0,87000,0.
-0,88000,0.
-0,89000,0.
-0,90000,0.
-0,91000,0.
-0,92000,0.
-0,93000,0.
-0,94000,0.
-0,95000,0.
-0,96000,0.
-0,97000,0.
-0,98000,0.
-0,99000,0.
-0,100000,0.
-0,101000,0.
-0,102000,0.
-0,103000,0.
-0,104000,0.
-0,105000,0.
-0,106000,0.
-0,107000,0.
-0,108000,0.
-0,109000,0.
-0,110000,0.
-0,111000,0.
-0,112000,0.
-0,113000,0.
-0,114000,0.
-0,115000,0.
-0,116000,0.
-0,117000,0.3408001968637109
-0,118000,0.3406590083613992
-0,119000,0.34159310162067413
-0,120000,0.3412698395550251
-0,121000,0.3411141689866781
-0,122000,0.34038866870105267
-0,123000,0.34109046682715416
+0,1000,0.3305374411866069
+0,2000,0.3307490376755595
+0,3000,0.33029515761882067
+0,4000,0.3363496856763959
+0,5000,0.33181568142026663
+0,6000,0.33005450386554
+0,7000,0.3400696674361825
+0,8000,0.3337323432788253
+0,9000,0.3324012039229274
+0,10000,0.3307985607534647
+0,11000,0.33261950593441725
+0,12000,0.3310671541839838
+0,13000,0.3318543080240488
+0,14000,0.33143835607916117
+0,15000,0.33053881488740444
+0,16000,0.33193877898156643
+0,17000,0.33089425414800644
+0,18000,0.3306396072730422
+0,19000,0.3306496189907193
+0,20000,0.3303301753476262
+0,21000,0.33140445593744516
+0,22000,0.33020293340086937
+0,23000,0.3308977698907256
+0,24000,0.3299850504845381
+0,25000,0.3310497384518385
+0,26000,0.33014328218996525
+0,27000,0.3300098003819585
+0,28000,0.3302561352029443
+0,29000,0.33076435793191195
+0,30000,0.33028211910277605
+0,31000,0.33056996762752533
+0,32000,0.3295192960649729
+0,33000,0.32972143962979317
+0,34000,0.3295718925073743
+0,35000,0.3292877459898591
+0,36000,0.32971047330647707
+0,37000,0.3298777621239424
+0,38000,0.32897726632654667
+0,39000,0.32850170973688364
+0,40000,0.32882699742913246
+0,41000,0.3295514499768615
+0,42000,0.3299229545518756
+0,43000,0.3288676030933857
+0,44000,0.3289586631581187
+0,45000,0.3293627640232444
+0,46000,0.3293595276772976
+0,47000,0.32900278456509113
+0,48000,0.3292385721579194
+0,49000,0.3301975317299366
+0,50000,0.3289953572675586
+0,51000,0.3290593158453703
+0,52000,0.32869731076061726
+0,53000,0.32881496008485556
+0,54000,0.32918842043727636
+0,55000,0.32880869694054127
+0,56000,0.3290021326392889
+0,57000,0.3285171231254935
+0,58000,0.3282122313976288
+0,59000,0.32952832989394665
+0,60000,0.3284840611740947
+0,61000,0.32849146518856287
+0,62000,0.3292925423011184
+0,63000,0.32826855313032866
+0,64000,0.32868378330022097
+0,65000,0.3289590124040842
+0,66000,0.32914113253355026
+0,67000,0.32902946695685387
+0,68000,0.327494740486145
+0,69000,0.32870552968233824
+0,70000,0.32899840734899044
+0,71000,0.3288975451141596
+0,72000,0.32872746232897043
+0,73000,0.3284345380961895
+0,74000,0.32932700123637915
+0,75000,0.3289450891315937
+0,76000,0.32835565507411957
+0,77000,0.3284606384113431
+0,78000,0.3285201732069254
+0,79000,0.3282968420535326
+0,80000,0.3280844073742628
+0,81000,0.3282552817836404
+0,82000,0.32890671864151955
+0,83000,0.3278125077486038
+0,84000,0.32847451511770487
+0,85000,0.3284695325419307
+0,86000,0.3288332372903824
+0,87000,0.32888082787394524
+0,88000,0.32766179647296667
+0,89000,0.3283764934167266
+0,90000,0.32793665304780006
+0,91000,0.32725471537560225
+0,92000,0.3277936251834035
+0,93000,0.3274726215749979
+0,94000,0.32755015417933464
+0,95000,0.3280520439147949
+0,96000,0.3282654797658324
+0,97000,0.32788922544568777
+0,98000,0.32690519001334906
+0,99000,0.32813241705298424
+0,100000,0.3279812401160598
+0,101000,0.32842066138982773
+0,102000,0.3276278730481863
+0,103000,0.32748677767813206
+0,104000,0.3282419638708234
+0,105000,0.3277064301073551
+0,106000,0.32805497758090496
+0,107000,0.3275437746196985
+0,108000,0.32795085571706295
+0,109000,0.32730509992688894
+0,110000,0.32666658516973257
+0,111000,0.3273332491517067
+0,112000,0.32759017776697874
+0,113000,0.32762193586677313
+0,114000,0.32658560667186975
+0,115000,0.3273524809628725
+0,116000,0.3271533874794841
eval/mse_evaluation_Tatoeba-eng-swe-dev.tsv.gz_results.csv
CHANGED
@@ -1,124 +1,117 @@
 epoch,steps,MSE
-0,1000,0.
-0,2000,0.
-0,3000,0.
-0,4000,0.
-0,5000,0.
-0,6000,0.
-0,7000,0.
-0,8000,0.
-0,9000,0.
-0,10000,0.
-0,11000,0.
-0,12000,0.
-0,13000,0.
-0,14000,0.
-0,15000,0.
-0,16000,0.
-0,17000,0.
-0,18000,0.
-0,19000,0.
-0,20000,0.
-0,21000,0.
-0,22000,0.
-0,23000,0.
-0,24000,0.
-0,25000,0.
-0,26000,0.
-0,27000,0.
-0,28000,0.
-0,29000,0.
-0,30000,0.
-0,31000,0.
-0,32000,0.
-0,33000,0.
-0,34000,0.
-0,35000,0.
-0,36000,0.
-0,37000,0.
-0,38000,0.
-0,39000,0.
-0,40000,0.
-0,41000,0.
-0,42000,0.
-0,43000,0.
-0,44000,0.
-0,45000,0.
-0,46000,0.
-0,47000,0.
-0,48000,0.
-0,49000,0.
-0,50000,0.
-0,51000,0.
-0,52000,0.
-0,53000,0.
-0,54000,0.
-0,55000,0.
-0,56000,0.
-0,57000,0.
-0,58000,0.
-0,59000,0.
-0,60000,0.
-0,61000,0.
-0,62000,0.
-0,63000,0.
-0,64000,0.
-0,65000,0.
-0,66000,0.
-0,67000,0.
-0,68000,0.
-0,69000,0.
-0,70000,0.
-0,71000,0.
-0,72000,0.
-0,73000,0.
-0,74000,0.
-0,75000,0.
-0,76000,0.
-0,77000,0.
-0,78000,0.
-0,79000,0.
-0,80000,0.
-0,81000,0.
-0,82000,0.
-0,83000,0.
-0,84000,0.
-0,85000,0.
-0,86000,0.
-0,87000,0.
-0,88000,0.
-0,89000,0.
-0,90000,0.
-0,91000,0.
-0,92000,0.
-0,93000,0.
-0,94000,0.
-0,95000,0.
-0,96000,0.
-0,97000,0.
-0,98000,0.
-0,99000,0.
-0,100000,0.
-0,101000,0.
-0,102000,0.
-0,103000,0.
-0,104000,0.
-0,105000,0.
-0,106000,0.
-0,107000,0.
-0,108000,0.
-0,109000,0.
-0,110000,0.
-0,111000,0.
-0,112000,0.
-0,113000,0.
-0,114000,0.
-0,115000,0.
-0,116000,0.
-0,117000,0.26732038240879774
-0,118000,0.2673294395208359
-0,119000,0.26773312129080296
-0,120000,0.26698445435613394
-0,121000,0.2667626366019249
-0,122000,0.2666329964995384
-0,123000,0.2665518783032894
+0,1000,0.24785797577351332
+0,2000,0.2478782320395112
+0,3000,0.24820237886160612
+0,4000,0.25052379351109266
+0,5000,0.2484829630702734
+0,6000,0.24821306578814983
+0,7000,0.25101194623857737
+0,8000,0.24918639101088047
+0,9000,0.24825208820402622
+0,10000,0.24815460201352835
+0,11000,0.24851374328136444
+0,12000,0.24818778038024902
+0,13000,0.247632572427392
+0,14000,0.24752533063292503
+0,15000,0.24750018492341042
+0,16000,0.24753324687480927
+0,17000,0.24697233457118273
+0,18000,0.24762307293713093
+0,19000,0.24796028155833483
+0,20000,0.24713336024433374
+0,21000,0.2480252180248499
+0,22000,0.24814684875309467
+0,23000,0.24748872965574265
+0,24000,0.24726693518459797
+0,25000,0.24767774157226086
+0,26000,0.2473350614309311
+0,27000,0.24677005130797625
+0,28000,0.24716746993362904
+0,29000,0.24732353631407022
+0,30000,0.2474617213010788
+0,31000,0.24711331352591515
+0,32000,0.24705547839403152
+0,33000,0.24687713012099266
+0,34000,0.24697906337678432
+0,35000,0.24673829320818186
+0,36000,0.24703482631593943
+0,37000,0.24725922849029303
+0,38000,0.24706728290766478
+0,39000,0.2470718463882804
+0,40000,0.24706239346414804
+0,41000,0.247084628790617
+0,42000,0.24669873528182507
+0,43000,0.2467589918524027
+0,44000,0.24676024913787842
+0,45000,0.24646525271236897
+0,46000,0.24657496251165867
+0,47000,0.245952932164073
+0,48000,0.24603260681033134
+0,49000,0.2463929122313857
+0,50000,0.24623936042189598
+0,51000,0.24639982730150223
+0,52000,0.2464748453348875
+0,53000,0.24611358530819416
+0,54000,0.24669363629072905
+0,55000,0.2464905148372054
+0,56000,0.24694388266652822
+0,57000,0.24669785052537918
+0,58000,0.24613626301288605
+0,59000,0.2463148208335042
+0,60000,0.24608178064227104
+0,61000,0.24618220049887896
+0,62000,0.24709079880267382
+0,63000,0.2463518874719739
+0,64000,0.24638932663947344
+0,65000,0.2465276513248682
+0,66000,0.24644010700285435
+0,67000,0.24675603490322828
+0,68000,0.2460342599079013
+0,69000,0.24680779315531254
+0,70000,0.24674353189766407
+0,71000,0.24644036311656237
+0,72000,0.24643116630613804
+0,73000,0.24601935874670744
+0,74000,0.24650872219353914
+0,75000,0.2464913995936513
+0,76000,0.24664695374667645
+0,77000,0.24642888456583023
+0,78000,0.24638038594275713
+0,79000,0.24592014960944653
+0,80000,0.24589370004832745
+0,81000,0.2457715105265379
+0,82000,0.24643635842949152
+0,83000,0.24539267178624868
+0,84000,0.24630383122712374
+0,85000,0.2461036667227745
+0,86000,0.24639442563056946
+0,87000,0.24640655610710382
+0,88000,0.2456542570143938
+0,89000,0.2460445510223508
+0,90000,0.24578433949500322
+0,91000,0.24577109143137932
+0,92000,0.24596715811640024
+0,93000,0.2458097180351615
+0,94000,0.24577626027166843
+0,95000,0.24602224584668875
+0,96000,0.24567588698118925
+0,97000,0.24607458617538214
+0,98000,0.24560708552598953
+0,99000,0.2458440838381648
+0,100000,0.24566156789660454
+0,101000,0.24600222241133451
+0,102000,0.24566147476434708
+0,103000,0.2461111405864358
+0,104000,0.2459018025547266
+0,105000,0.24577155709266663
+0,106000,0.2457206603139639
+0,107000,0.24565793573856354
+0,108000,0.24591167457401752
+0,109000,0.24543707258999348
+0,110000,0.24535527918487787
+0,111000,0.2456084592267871
+0,112000,0.24576487485319376
+0,113000,0.24569914676249027
+0,114000,0.24549257941544056
+0,115000,0.24552540853619576
+0,116000,0.2456445712596178
eval/translation_evaluation_TED2020-en-sv-dev.tsv.gz_results.csv
CHANGED
@@ -1,124 +1,117 @@
 epoch,steps,src2trg,trg2src
-0,1000,0.
-0,2000,0.
-0,3000,0.
-0,4000,0.
-0,5000,0.
-0,6000,0.
-0,7000,0.
-0,8000,0.
-0,9000,0.
-0,10000,0.
-0,11000,0.
-0,12000,0.
-0,13000,0.
-0,14000,0.
-0,15000,0.
-0,16000,0.
-0,17000,0.
-0,18000,0.
-0,19000,0.
-0,20000,0.
-0,21000,0.
-0,22000,0.
-0,23000,0.
-0,24000,0.
-0,25000,0.
-0,26000,0.
-0,27000,0.
-0,28000,0.
-0,29000,0.
-0,30000,0.
-0,31000,0.
-0,32000,0.
-0,33000,0.
-0,34000,0.
-0,35000,0.
-0,36000,0.
-0,37000,0.
-0,38000,0.
-0,39000,0.
-0,40000,0.
-0,41000,0.
-0,42000,0.
-0,43000,0.
-0,44000,0.
-0,45000,0.
-0,46000,0.
-0,47000,0.
-0,48000,0.
-0,49000,0.
-0,50000,0.
-0,51000,0.
-0,52000,0.
-0,53000,0.
-0,54000,0.
-0,55000,0.
-0,56000,0.
-0,57000,0.
-0,58000,0.
-0,59000,0.
-0,60000,0.
-0,61000,0.
-0,62000,0.
-0,63000,0.
-0,64000,0.
-0,65000,0.
-0,66000,0.
-0,67000,0.
-0,68000,0.
-0,69000,0.
-0,70000,0.
-0,71000,0.
-0,72000,0.
-0,73000,0.
-0,74000,0.
-0,75000,0.
-0,76000,0.
-0,77000,0.
-0,78000,0.
-0,79000,0.
-0,80000,0.
-0,81000,0.
-0,82000,0.
-0,83000,0.
-0,84000,0.
-0,85000,0.
-0,86000,0.
-0,87000,0.
-0,88000,0.
-0,89000,0.
-0,90000,0.
-0,91000,0.
-0,92000,0.
-0,93000,0.
-0,94000,0.
-0,95000,0.
-0,96000,0.
-0,97000,0.
-0,98000,0.
-0,99000,0.
-0,100000,0.
-0,101000,0.
-0,102000,0.
-0,103000,0.
-0,104000,0.
-0,105000,0.
-0,106000,0.
-0,107000,0.
-0,108000,0.
-0,109000,0.
-0,110000,0.
-0,111000,0.
-0,112000,0.
-0,113000,0.
-0,114000,0.
-0,115000,0.
-0,116000,0.
-0,117000,0.975,0.974
-0,118000,0.976,0.974
-0,119000,0.976,0.974
-0,120000,0.976,0.974
-0,121000,0.976,0.974
-0,122000,0.976,0.974
-0,123000,0.975,0.974
+0,1000,0.971,0.971
+0,2000,0.971,0.971
+0,3000,0.97,0.972
+0,4000,0.971,0.972
+0,5000,0.971,0.971
+0,6000,0.971,0.972
+0,7000,0.971,0.972
+0,8000,0.971,0.973
+0,9000,0.97,0.971
+0,10000,0.97,0.97
+0,11000,0.97,0.97
+0,12000,0.97,0.971
+0,13000,0.969,0.97
+0,14000,0.97,0.971
+0,15000,0.97,0.971
+0,16000,0.97,0.971
+0,17000,0.971,0.971
+0,18000,0.971,0.971
+0,19000,0.97,0.971
+0,20000,0.97,0.971
+0,21000,0.97,0.971
+0,22000,0.97,0.971
+0,23000,0.971,0.971
+0,24000,0.97,0.971
+0,25000,0.97,0.971
+0,26000,0.97,0.97
+0,27000,0.97,0.971
+0,28000,0.97,0.971
+0,29000,0.97,0.97
+0,30000,0.969,0.971
+0,31000,0.97,0.97
+0,32000,0.97,0.97
+0,33000,0.969,0.971
+0,34000,0.97,0.971
+0,35000,0.97,0.969
+0,36000,0.97,0.971
+0,37000,0.97,0.969
+0,38000,0.97,0.971
+0,39000,0.97,0.971
+0,40000,0.971,0.971
+0,41000,0.97,0.971
+0,42000,0.97,0.971
+0,43000,0.97,0.971
+0,44000,0.971,0.971
+0,45000,0.969,0.971
+0,46000,0.969,0.972
+0,47000,0.971,0.971
+0,48000,0.971,0.971
+0,49000,0.971,0.97
+0,50000,0.97,0.971
+0,51000,0.972,0.97
+0,52000,0.97,0.971
+0,53000,0.97,0.971
+0,54000,0.97,0.971
+0,55000,0.97,0.971
+0,56000,0.97,0.971
+0,57000,0.97,0.971
+0,58000,0.969,0.971
+0,59000,0.972,0.971
+0,60000,0.971,0.97
+0,61000,0.971,0.971
+0,62000,0.971,0.971
+0,63000,0.971,0.97
+0,64000,0.971,0.971
+0,65000,0.971,0.971
+0,66000,0.971,0.971
+0,67000,0.971,0.971
+0,68000,0.97,0.971
+0,69000,0.971,0.971
+0,70000,0.971,0.971
+0,71000,0.971,0.97
+0,72000,0.971,0.971
+0,73000,0.971,0.971
+0,74000,0.97,0.971
+0,75000,0.971,0.971
+0,76000,0.971,0.97
+0,77000,0.971,0.971
+0,78000,0.971,0.971
+0,79000,0.971,0.97
+0,80000,0.97,0.971
+0,81000,0.97,0.971
+0,82000,0.971,0.97
+0,83000,0.971,0.97
+0,84000,0.971,0.97
+0,85000,0.97,0.97
+0,86000,0.971,0.971
+0,87000,0.971,0.971
+0,88000,0.971,0.971
+0,89000,0.971,0.97
+0,90000,0.97,0.97
+0,91000,0.971,0.971
+0,92000,0.97,0.97
+0,93000,0.97,0.97
+0,94000,0.97,0.971
+0,95000,0.97,0.97
+0,96000,0.97,0.97
+0,97000,0.97,0.971
+0,98000,0.97,0.971
+0,99000,0.971,0.971
+0,100000,0.97,0.971
+0,101000,0.971,0.971
+0,102000,0.97,0.971
+0,103000,0.97,0.97
+0,104000,0.97,0.971
+0,105000,0.97,0.971
+0,106000,0.971,0.97
+0,107000,0.97,0.97
+0,108000,0.971,0.971
+0,109000,0.971,0.97
+0,110000,0.971,0.97
+0,111000,0.971,0.971
+0,112000,0.971,0.971
+0,113000,0.971,0.971
+0,114000,0.971,0.97
+0,115000,0.97,0.969
+0,116000,0.97,0.97
eval/translation_evaluation_Tatoeba-eng-swe-dev.tsv.gz_results.csv
CHANGED
@@ -1,124 +1,117 @@
 epoch,steps,src2trg,trg2src
-0,1000,0.97,0.
-0,2000,0.
-0,3000,0.
-0,4000,0.
-0,5000,0.
-0,6000,0.
-0,7000,0.
-0,8000,0.
-0,9000,0.
-0,10000,0.
-0,11000,0.
-0,12000,0.
-0,13000,0.
-0,14000,0.
-0,15000,0.
-0,16000,0.
-0,17000,0.
-0,18000,0.967,0.
-0,19000,0.
-0,20000,0.
-0,21000,0.
-0,22000,0.967,0.
-0,23000,0.
-0,24000,0.
-0,25000,0.
-0,26000,0.
-0,27000,0.
-0,28000,0.
-0,29000,0.
-0,30000,0.
-0,31000,0.
-0,32000,0.967,0.
-0,33000,0.
-0,34000,0.
-0,35000,0.968,0.
-0,36000,0.
-0,37000,0.
-0,38000,0.968,0.
-0,39000,0.
-0,40000,0.
-0,41000,0.
-0,42000,0.
-0,43000,0.
-0,44000,0.
-0,45000,0.
-0,46000,0.
-0,47000,0.
-0,48000,0.
-0,49000,0.
-0,50000,0.966,0.
-0,51000,0.967,0.
-0,52000,0.
-0,53000,0.
-0,54000,0.
-0,55000,0.
-0,56000,0.
-0,57000,0.968,0.
-0,58000,0.
-0,59000,0.
-0,60000,0.97,0.
-0,61000,0.
-0,62000,0.
-0,63000,0.
-0,64000,0.
-0,65000,0.
-0,66000,0.968,0.
-0,67000,0.
-0,68000,0.
-0,69000,0.
-0,70000,0.
-0,71000,0.
-0,72000,0.
-0,73000,0.
-0,74000,0.
-0,75000,0.
-0,76000,0.
-0,77000,0.
-0,78000,0.
-0,79000,0.
-0,80000,0.
-0,81000,0.
-0,82000,0.
-0,83000,0.
-0,84000,0.
-0,85000,0.
-0,86000,0.
-0,87000,0.
-0,88000,0.
-0,89000,0.967,0.
-0,90000,0.
-0,91000,0.967,0.
-0,92000,0.
-0,93000,0.
-0,94000,0.
-0,95000,0.967,0.
-0,96000,0.
-0,97000,0.
-0,98000,0.969,0.
-0,99000,0.
-0,100000,0.
-0,101000,0.
-0,102000,0.
-0,103000,0.
-0,104000,0.
-0,105000,0.
-0,106000,0.
-0,107000,0.
-0,108000,0.
-0,109000,0.
-0,110000,0.
-0,111000,0.
-0,112000,0.
-0,113000,0.
-0,114000,0.
-0,115000,0.966,0.
-0,116000,0.
-0,117000,0.966,0.97
-0,118000,0.966,0.969
-0,119000,0.964,0.969
-0,120000,0.966,0.969
-0,121000,0.966,0.97
-0,122000,0.967,0.97
-0,123000,0.966,0.97
+0,1000,0.97,0.967
+0,2000,0.97,0.968
+0,3000,0.97,0.967
+0,4000,0.97,0.968
+0,5000,0.971,0.967
+0,6000,0.97,0.966
+0,7000,0.971,0.968
+0,8000,0.97,0.968
+0,9000,0.971,0.966
+0,10000,0.969,0.968
+0,11000,0.97,0.967
+0,12000,0.967,0.967
+0,13000,0.969,0.965
+0,14000,0.971,0.966
+0,15000,0.971,0.967
+0,16000,0.97,0.965
+0,17000,0.969,0.965
+0,18000,0.967,0.967
+0,19000,0.969,0.966
+0,20000,0.968,0.967
+0,21000,0.968,0.966
+0,22000,0.967,0.965
+0,23000,0.97,0.966
+0,24000,0.969,0.967
+0,25000,0.967,0.967
+0,26000,0.968,0.968
+0,27000,0.967,0.965
+0,28000,0.968,0.966
+0,29000,0.967,0.967
+0,30000,0.968,0.968
+0,31000,0.966,0.967
+0,32000,0.967,0.967
+0,33000,0.969,0.967
+0,34000,0.969,0.967
+0,35000,0.968,0.967
+0,36000,0.969,0.967
+0,37000,0.969,0.967
+0,38000,0.968,0.968
+0,39000,0.969,0.966
+0,40000,0.965,0.968
+0,41000,0.968,0.966
+0,42000,0.968,0.969
+0,43000,0.968,0.969
+0,44000,0.969,0.969
+0,45000,0.966,0.967
+0,46000,0.966,0.968
+0,47000,0.968,0.967
+0,48000,0.966,0.968
+0,49000,0.967,0.968
+0,50000,0.966,0.969
+0,51000,0.967,0.967
+0,52000,0.967,0.968
+0,53000,0.968,0.968
+0,54000,0.968,0.966
+0,55000,0.968,0.967
+0,56000,0.968,0.968
+0,57000,0.968,0.967
+0,58000,0.969,0.967
+0,59000,0.969,0.967
+0,60000,0.97,0.966
+0,61000,0.969,0.968
+0,62000,0.968,0.97
+0,63000,0.969,0.969
+0,64000,0.969,0.968
+0,65000,0.969,0.969
+0,66000,0.968,0.965
+0,67000,0.969,0.968
+0,68000,0.968,0.967
+0,69000,0.968,0.968
+0,70000,0.97,0.968
+0,71000,0.97,0.969
+0,72000,0.967,0.968
+0,73000,0.967,0.968
+0,74000,0.968,0.967
+0,75000,0.969,0.965
+0,76000,0.969,0.968
+0,77000,0.968,0.965
+0,78000,0.968,0.966
+0,79000,0.97,0.966
+0,80000,0.969,0.968
+0,81000,0.969,0.967
+0,82000,0.968,0.967
+0,83000,0.969,0.967
+0,84000,0.968,0.967
+0,85000,0.968,0.969
+0,86000,0.97,0.966
+0,87000,0.968,0.969
+0,88000,0.967,0.97
+0,89000,0.967,0.967
+0,90000,0.967,0.968
+0,91000,0.967,0.967
+0,92000,0.968,0.967
+0,93000,0.971,0.966
+0,94000,0.97,0.968
+0,95000,0.967,0.964
+0,96000,0.967,0.966
+0,97000,0.969,0.964
+0,98000,0.969,0.966
+0,99000,0.969,0.967
+0,100000,0.968,0.967
+0,101000,0.967,0.966
+0,102000,0.967,0.967
+0,103000,0.967,0.967
+0,104000,0.967,0.966
+0,105000,0.966,0.967
+0,106000,0.968,0.968
+0,107000,0.968,0.966
+0,108000,0.967,0.968
+0,109000,0.967,0.967
+0,110000,0.968,0.969
+0,111000,0.967,0.968
+0,112000,0.967,0.968
+0,113000,0.967,0.966
+0,114000,0.966,0.966
+0,115000,0.966,0.967
+0,116000,0.966,0.966
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:6bea01d07d85dc7870dfd9907ca78bac21ee8b1ef81c838f25a88b6494c79b32
 size 498834989
tokenizer_config.json
CHANGED
@@ -4,7 +4,7 @@
   "do_lower_case": false,
   "mask_token": "[MASK]",
   "model_max_length": 1000000000000000019884624838656,
-  "name_or_path": "output/
+  "name_or_path": "output/no-normalize-en-sv-2022-12-26_18-39-42/",
   "never_split": null,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
|