Web as a corpus, Large Language Models, Machine Translation, Language Technologies, Natural Language Processing

Our project name, HPLT, is an acronym for High Performance Language Technologies. We aim to combine large quantities of data, a number of languages, and high-performance computing to build powerful and efficient language and translation models. A further goal is to publish the project's results in a shared space under open licenses.