---
license: mit
language:
- fr
library_name: flair
tags:
- legal
---

This is a version of the flair/ner-french model fine-tuned on a corpus of 127 case reports from the European Court of Human Rights (ECHR) in French. The corpus was built and annotated for anonymization as part of the work presented in the Master's thesis "Automatic anonymization of legal texts from the European Court of Human Rights: building four corpora of case reports in French and Spanish language for anonymization". The annotation was carried out by projecting the annotations of the English corpus built by Pilán et al. (2022).

The model predicts 8 tags: DATETIME, CODE, PER, DEM, MISC, ORG, LOC, QUANTITY.

The corpus and the code used for fine-tuning this model are available on GitHub: https://github.com/msierrofer/automatic-anonymization-ECHR-French-Spanish/tree/full-corpora-(127-texts)

References

Pilán, I., Lison, P., Øvrelid, L., Papadopoulou, A., Sánchez, D. & Batet, M. (2022). The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization. Computational Linguistics, 48(4), pp. 1053–1101. Cambridge, MA: MIT Press. doi: 10.1162/coli_a_00458.
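As a rough usage sketch, the model's predictions could be turned into an anonymized text by replacing each detected span with its tag. The code below is an illustration, not the thesis pipeline: `mask_entities` and `predict_spans` are hypothetical helper names, `model_path` is a placeholder for wherever the fine-tuned model is stored, and the label type `"ner"` is assumed to be inherited from flair/ner-french. It also assumes that Flair's span offsets refer to character positions in the original input string.

```python
from typing import List, Tuple

# The 8 tags listed in this model card.
LABELS = {"DATETIME", "CODE", "PER", "DEM", "MISC", "ORG", "LOC", "QUANTITY"}


def mask_entities(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Replace each (start, end, tag) character span with a [TAG] placeholder."""
    # Work from right to left so earlier offsets stay valid after each replacement.
    for start, end, tag in sorted(spans, reverse=True):
        text = text[:start] + f"[{tag}]" + text[end:]
    return text


def predict_spans(text: str, model_path: str) -> List[Tuple[int, int, str]]:
    """Run the tagger and return (start, end, tag) triples.

    `model_path` is a placeholder: a local path or hub identifier of the
    fine-tuned model. The "ner" label type is an assumption.
    """
    # Imported lazily so that mask_entities can be used without flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_path)
    sentence = Sentence(text)
    tagger.predict(sentence)
    return [
        (span.start_position, span.end_position, span.get_label("ner").value)
        for span in sentence.get_spans("ner")
    ]
```

With hand-written spans standing in for model output, the masking step behaves as follows:

```python
text = "M. Dupont habite Lyon depuis le 3 mai 1970."
spans = [(3, 9, "PER"), (17, 21, "LOC"), (32, 42, "DATETIME")]
print(mask_entities(text, spans))
# M. [PER] habite [LOC] depuis le [DATETIME].
```

Replacing spans from right to left avoids recomputing offsets after each substitution. Real case reports are long, so splitting them into sentences before calling the tagger is likely to be necessary in practice.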