license: mit
Model Description
This model was built with OpenNMT-py 3.2 for the Spanish-Asturian language pair using a Transformer architecture, and was then converted to the CTranslate2 format. It was trained for the paper "Training and fine-tuning NMT models for low-resource languages using Apertium-based synthetic corpora".
How to Translate with this Model
- Install Python 3.9
- Install CTranslate2 3.2
- Translate an input text file (input.txt) using the NOS-MT-es-ast model with the following commands, which tokenize the text, apply BPE, translate, and remove the BPE markers:
perl tokenizer.perl < input.txt > input.tok
subword-nmt apply-bpe -c ./bpe/es.bpe < input.tok > input.bpe
python3 translate.py ./ct2-ast input.bpe > output.txt
sed -i 's/@@ //g' output.txt
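
The contents of translate.py are not shown here, so the following is only a minimal sketch of what such a script might look like, assuming it wraps the CTranslate2 Python API (`ctranslate2.Translator` and `translate_batch`). The helper names `strip_bpe`, `translate_lines`, and `main`, the batch size, and the choice of CPU are illustrative assumptions, not the project's actual code.

```python
import sys


def strip_bpe(line: str) -> str:
    """Undo BPE segmentation, mirroring sed 's/@@ //g'."""
    return line.replace("@@ ", "")


def translate_lines(translator, lines, batch_size=32):
    """Translate BPE-segmented lines with a CTranslate2-style translator."""
    # translate_batch expects each sentence as a list of string tokens.
    batch = [line.split() for line in lines]
    results = translator.translate_batch(batch, max_batch_size=batch_size)
    # hypotheses[0] is the best hypothesis, again as a list of tokens.
    return [" ".join(result.hypotheses[0]) for result in results]


def main(model_dir: str, input_path: str) -> None:
    # Imported here so the helpers above can be used without CTranslate2.
    import ctranslate2

    # Assumption: CPU inference; the model directory is e.g. ./ct2-ast.
    translator = ctranslate2.Translator(model_dir, device="cpu")
    with open(input_path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
    for translation in translate_lines(translator, lines):
        print(strip_bpe(translation))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

If saved under a hypothetical name such as translate_sketch.py, it could be invoked as `python3 translate_sketch.py ./ct2-ast input.bpe > output.txt`. Because this sketch removes the BPE markers itself, the separate sed step would not be needed in that case.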
Funding
This model was developed within the Nós Project, funded by the Ministerio para la Transformación Digital y de la Función Pública and by the EU (NextGenerationEU) within the framework of the [ILENIA project](https://proyectoilenia.es/), reference 2022/TL22/00215336.
Citation
If you use this model in your research, please cite the following paper: Sant, A., Bardanca Outeiriño, D., Pichel Campos, J. R., De Luca Fornaciari, F., Escolano, C., García Gilabert, J., Gamallo Otero, P., Mash, A., Liao, X., & Melero, M. (2023). Training and fine-tuning NMT models for low-resource languages using Apertium-based synthetic corpora. arXiv.