--- language: - mt datasets: - MLRS/korpus_malti model-index: - name: BERTu results: - task: type: dependency-parsing name: Dependency Parsing dataset: type: universal_dependencies args: mt_mudt name: Maltese Universal Dependencies Treebank (MUDT) metrics: - type: uas value: 92.31 name: Unlabelled Attachment Score - type: las value: 88.14 name: Labelled Attachment Score - task: type: part-of-speech-tagging name: Part-of-Speech Tagging dataset: type: mlrs_pos name: MLRS POS dataset metrics: - type: accuracy value: 98.58 name: UPOS Accuracy args: upos - type: accuracy value: 98.54 name: XPOS Accuracy args: xpos - task: type: named-entity-recognition name: Named Entity Recognition dataset: type: wikiann name: WikiAnn (Maltese) args: mt metrics: - type: f1 args: span value: 86.77 name: Span-based F1 - task: type: sentiment-analysis name: Sentiment Analysis dataset: type: mt-sentiment-analysis name: Maltese Sentiment Analysis Dataset metrics: - type: f1 args: macro value: 78.96 name: Macro-averaged F1 license: cc-by-nc-sa-4.0 widget: - text: "Malta hija gżira fil-[MASK]." --- # BERTu A Maltese monolingual model pre-trained from scratch on the Korpus Malti v4.0 using the BERT (base) architecture. ## License This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa]. Permissions beyond the scope of this license may be available at [https://mlrs.research.um.edu.mt/](https://mlrs.research.um.edu.mt/). [![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa] [cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/ [cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png ## Citation This work was first presented in [Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese](https://aclanthology.org/2022.deeplo-1.10/). Cite it as follows: ```bibtex @inproceedings{BERTu, title = "Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and {BERT} Models for {M}altese", author = "Micallef, Kurt and Gatt, Albert and Tanti, Marc and van der Plas, Lonneke and Borg, Claudia", booktitle = "Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing", month = jul, year = "2022", address = "Hybrid", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2022.deeplo-1.10", doi = "10.18653/v1/2022.deeplo-1.10", pages = "90--101", } ```