BSC-LT/roberta_model_for_anonimization

license: mit
language:
  - es
  - ca
metrics:
  - f1
  - precision
  - recall
pipeline_tag: token-classification
widget:
  - text: Me llamo Alex y vivo en Barcelona

This is a multilingual (Catalan and Spanish) RoBERTa anonymization model, for use with BSC's AnonymizationPipeline, available at:

https://github.com/TeMU-BSC/AnonymizationPipeline

The anonymization pipeline is a library that identifies sensitive data in Spanish and Catalan user-generated plain text and then anonymizes what it detects.

This model can be used standalone, but it is meant to work within the pipeline.

The RoBERTa model can detect the following entity types: ORG (organizations), PER (persons), and LOC (locations).
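
To illustrate what happens after detection, here is a minimal sketch of the replacement step, assuming the detected entities are available as `(start, end, label)` character spans that do not overlap. The `anonymize` function, the `Span` type, and the `<LABEL>` placeholder style are hypothetical names chosen for this example; they are not the AnonymizationPipeline API, and the spans are written by hand rather than produced by the model.

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset, entity label).
Span = Tuple[int, int, str]


def anonymize(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a placeholder such as <PER>.

    Assumes the spans do not overlap. Spans are processed from the end
    of the text backwards so that earlier offsets stay valid.
    """
    result = text
    for start, end, label in sorted(spans, reverse=True):
        result = result[:start] + f"<{label}>" + result[end:]
    return result


text = "Me llamo Alex y vivo en Barcelona"

# Hand-written spans standing in for model output (an assumption).
alex_start = text.index("Alex")
bcn_start = text.index("Barcelona")
spans = [
    (alex_start, alex_start + len("Alex"), "PER"),
    (bcn_start, bcn_start + len("Barcelona"), "LOC"),
]

print(anonymize(text, spans))  # Me llamo <PER> y vivo en <LOC>
```

Replacing from the last span to the first keeps the remaining offsets correct even though each placeholder changes the length of the string.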

| Type | Score |
|---|---|
| `ENTS_F` | 90.03 |
| `ENTS_P` | 89.7 |
| `ENTS_R` | 90.3 |