---
language:
  - es
  - qu
tags:
  - quechua
  - translation
  - spanish
license: apache-2.0
metrics:
  - bleu
  - sacrebleu
widget:
  - text: "Dios ama a los hombres"
  - text: "A pesar de todo, soy feliz"
  - text: "¿Qué harán allí?"
  - text: "Debes aprender a respetar"
---

# Spanish to Quechua translator

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small).

## Model description

t5-small-finetuned-spanish-to-quechua was trained for 46 epochs on 102,747 sentences; 12,844 sentences were used for validation and 12,843 for testing.

## Intended uses & limitations

A large part of the dataset was extracted from biblical texts, so the model performs better on sentences of that style than on others.

### How to use

You can import this model as follows:

```python
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
>>> model_name = 'hackathon-pln-es/t5-small-finetuned-spanish-to-quechua'
>>> model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
```

To translate a sentence:

```python
>>> sentence = "Entonces dijo"
>>> inputs = tokenizer(sentence, return_tensors="pt")
>>> outputs = model.generate(inputs["input_ids"], max_length=40, num_beams=4, early_stopping=True)
>>> print('Original sentence: {} \nTranslated sentence: {}'.format(sentence, tokenizer.decode(outputs[0], skip_special_tokens=True)))
```

### Limitations and bias

Currently, this model can only translate to Ayacucho Quechua.

## Training data

To train this model we used the [Spanish to Quechua dataset](https://huggingface.co/datasets/hackathon-pln-es/spanish-to-quechua).

## Evaluation results

We obtained the following metrics during training:

- eval_bleu = 2.9691
- eval_loss = 1.2064628601074219

## Team members

- [Sara Benel](https://huggingface.co/sbenel)
- [Jose Vílchez](https://huggingface.co/JCarlos)
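
## Reproducing the evaluation

As a rough sketch (not the team's actual evaluation script), the reported BLEU can be approximated with the `evaluate` library, reusing the `model` and `tokenizer` loaded above. The `"test"` split name and the `"es"`/`"qu"` column names are assumptions about the dataset schema, so verify them against the dataset card first.

```python
>>> from datasets import load_dataset
>>> import evaluate
>>> # Assumed split and column names; check the dataset card before running.
>>> test_set = load_dataset("hackathon-pln-es/spanish-to-quechua", split="test")
>>> sacrebleu = evaluate.load("sacrebleu")
>>> predictions, references = [], []
>>> for example in test_set:
...     inputs = tokenizer(example["es"], return_tensors="pt")
...     outputs = model.generate(inputs["input_ids"], max_length=40, num_beams=4, early_stopping=True)
...     predictions.append(tokenizer.decode(outputs[0], skip_special_tokens=True))
...     references.append([example["qu"]])  # sacrebleu expects a list of references per prediction
>>> print(sacrebleu.compute(predictions=predictions, references=references)["score"])
```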