imdbo committed on
Commit fa41308
1 Parent(s): aaf356b

Update README_English.md

Files changed (1):
  1. README_English.md +3 -3

README_English.md CHANGED
@@ -25,7 +25,7 @@ Model developed with OpenNMT for the Galician-Spanish pair using the transformer
 + Translate an input_text using the NOS-MT-gl-es model with the following command:
 
 ```bash
-onmt_translate -src input_text -model NOS-MT-es-gl -output ./output_file.txt -replace_unk -phrase_table phrase_table-gl-es.txt -gpu 0
+onmt_translate -src input_text -model NOS-MT-gl-es -output ./output_file.txt -replace_unk -phrase_table phrase_table-gl-es.txt -gpu 0
 ```
 + The resulting translation will be in the PATH indicated by the -output flag.
 
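The hunk above corrects the model name passed to OpenNMT-py's `onmt_translate` CLI (`NOS-MT-es-gl` → `NOS-MT-gl-es`). As an editorial sketch, not part of the commit, the same invocation can be assembled programmatically — useful when translating several input files in a loop. The paths and model name are the placeholders used in the README:

```python
import shlex

def build_translate_cmd(src_path, model_path, out_path, gpu=0):
    """Assemble the onmt_translate invocation from the README as an argv list.

    Flags mirror the command in the diff; paths/model name are placeholders.
    """
    return [
        "onmt_translate",
        "-src", src_path,
        "-model", model_path,
        "-output", out_path,
        "-replace_unk",
        "-phrase_table", "phrase_table-gl-es.txt",
        "-gpu", str(gpu),
    ]

if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.run() to execute.
    print(shlex.join(build_translate_cmd("input_text", "NOS-MT-gl-es", "./output_file.txt")))
```

The list form can be handed directly to `subprocess.run(cmd, check=True)` once the model and phrase table are actually on disk.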
@@ -40,7 +40,7 @@ Authentic corpora are corpora produced by human translators. Synthetic corpora a
 
 + Tokenisation was performed with a modified version of the [linguakit](https://github.com/citiususc/Linguakit) tokeniser (tokenizer.pl) that does not append a new line after each token.
 + All BPE models were generated with the script [learn_bpe.py](https://github.com/OpenNMT/OpenNMT-py/blob/master/tools/learn_bpe.py)
-+ Using the .yaml in this repository it is possible to replicate the original training process. Before training the model, please verify that the path to each target (tgt) and (src) file is correct. Once this is done, proceed as follows:
++ Using the .yaml in this repository, it is possible to replicate the original training process. Before training the model, please verify that the path to each target (tgt) and (src) file is correct. Once this is done, proceed as follows:
 
 ```bash
 onmt_build_vocab -config bpe-gl-es_emb.yaml -n_sample 100000
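As background for the `learn_bpe.py` step referenced in the hunk above: BPE learning repeatedly merges the most frequent adjacent symbol pair in the corpus vocabulary. A toy, stdlib-only sketch of one such merge step (illustrative only — the actual models were built with the linked OpenNMT-py script):

```python
from collections import Counter

def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair in a symbol-level vocab.

    `words` maps a tuple of symbols to its corpus frequency, e.g. the word
    "low" seen 5 times is {("l", "o", "w"): 5}.
    """
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Apply one BPE merge: join every occurrence of `pair` into one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged
```

`learn_bpe.py` iterates these two steps for the requested number of merges and writes the learned merge operations to a codes file.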
@@ -53,7 +53,7 @@ You may find the parameters used for this model inside the file bpe-gl-es_emb.y
 
 **Evaluation**
 
-The BLEU evaluation of the models is made with a mixture of internally developed tests (gold1, gold2, test-suite) and other datasets available in Galician (Flores).
+The BLEU evaluation of the models is a mixture of internally developed tests (gold1, gold2, test-suite) and other datasets available in Galician (Flores).
 
 | GOLD 1 | GOLD 2 | FLORES | TEST-SUITE|
 | ------------- |:-------------:| -------:|----------:|
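The scores in the (truncated) table above are BLEU. As an editorial aside, a minimal, stdlib-only sketch of single-sentence BLEU — geometric mean of clipped n-gram precisions times a brevity penalty. Real evaluations should use a standard implementation such as sacreBLEU; this only illustrates what the metric measures:

```python
import math
from collections import Counter

def bleu(hypothesis, reference, max_n=4):
    """Single-sentence BLEU (0-100), no smoothing. Inputs are token lists."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp = Counter(tuple(hypothesis[i:i + n]) for i in range(len(hypothesis) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        # Clipped overlap: each hypothesis n-gram counts at most as often as in the reference.
        overlap = sum(min(c, ref[g]) for g, c in hyp.items())
        precisions.append(overlap / max(sum(hyp.values()), 1))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hypothesis) > len(reference) else math.exp(1 - len(reference) / max(len(hypothesis), 1))
    return 100 * bp * geo_mean
```
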
 