---
license: mit
---

# Spanish truecasing model

This is a Spanish truecasing model that works with the `truecase` Python project by Dalton Fury: https://github.com/daltonfury42/truecase

The project is available on PyPI: https://pypi.org/project/truecase/

## Quick start

To use the Spanish model, use the TrueCaser.py file uploaded to this repository:

https://huggingface.co/HURIDOCS/spanish_truecasing/blob/main/TrueCaser.py

Install the requirements:

    pip install nltk

Then load the model and truecase a text:

    from TrueCaser import TrueCaser

    # Path to the Spanish model file
    model_path = "spanish.dist"
    spanish_truecasing = TrueCaser(model_path)

    text = 'informe no.78/08. petición 785-05 admisibilidad. vicente arturo villanueva ortega y otros.'
    print(spanish_truecasing.get_true_case(text))

## Notes

The model was trained on the Europarl dataset, which contains transcriptions of European Parliament discussions:

https://www.statmt.org/europarl/

Europarl: A Parallel Corpus for Statistical Machine Translation, Philipp Koehn, MT Summit 2005

The dataset was loaded with the Hugging Face `load_dataset` function:

    from datasets import load_dataset

    europarl = load_dataset('large_spanish_corpus', name='Europarl')
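
As a follow-up to the quick start, the sketch below shows one possible way to truecase several lines at once. The helper `truecase_lines` is hypothetical and not part of TrueCaser.py or the `truecase` package; it only assumes that the truecaser is exposed as a function taking a string and returning a string, as `get_true_case` is used in the quick start. A stand-in function replaces the real model so the sketch runs without `spanish.dist`.

```python
from typing import Callable, Iterable, List


def truecase_lines(lines: Iterable[str], truecase_fn: Callable[[str], str]) -> List[str]:
    """Apply truecase_fn to every non-blank line; blank lines become empty strings.

    truecase_lines is a hypothetical helper, not part of TrueCaser.py.
    """
    result = []
    for line in lines:
        stripped = line.strip()
        # Blank lines carry no casing information, so they bypass the truecaser.
        result.append(truecase_fn(stripped) if stripped else "")
    return result


if __name__ == "__main__":
    # Stand-in for spanish_truecasing.get_true_case (assumption: str -> str),
    # used here only so the sketch runs without the model file.
    demo_fn = str.capitalize
    lines = ["informe no.78/08.", "", "vicente arturo villanueva ortega y otros."]
    for out in truecase_lines(lines, demo_fn):
        print(out)
```

With the real model loaded as in the quick start, `spanish_truecasing.get_true_case` would be passed in place of `demo_fn`.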