metadata

license: openrail
datasets:
  - SetFit/enron_spam
metrics:
  - accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
  - email
  - multilingual

XLM-RoBERTa for multilingual spam detection

I trained this model to detect spam in German, since there is no labeled German spam mail dataset and I could not find an already pretrained model for this dataset.

Intended use

Identifying spam mail in any XLM-RoBERTa-supported language. Note that the model has not been thoroughly tested on its intended use; it has only been validated on the Enron mail dataset.
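A minimal sketch of how the model might be called through the `transformers` text-classification pipeline, matching the `library_name` and `pipeline_tag` in the metadata. The repository id in `MODEL_ID` is a hypothetical placeholder, not the real name of this model, and the helper names `load_classifier` and `classify_emails` are illustrative only. The label strings the model returns (for example `LABEL_0`/`LABEL_1` or `ham`/`spam`) are not stated in this card, so check the model's `config.json` before interpreting them.

```python
from typing import Any, Callable, Dict, List

# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "your-username/xlm-roberta-spam"


def load_classifier(model_id: str = MODEL_ID) -> Callable[..., List[Dict[str, Any]]]:
    """Build a text-classification pipeline for the given model id."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_emails(
    classifier: Callable[..., List[Dict[str, Any]]],
    emails: List[str],
) -> List[Dict[str, Any]]:
    """Run the classifier on a batch of emails and pair each text with its prediction."""
    # truncation=True cuts long emails down to the model's maximum input length.
    predictions = classifier(emails, truncation=True)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(emails, predictions)
    ]
```

Calling `classify_emails(load_classifier(), ["Sie haben gewonnen! Klicken Sie hier.", "Anbei das Protokoll vom Montag."])` would return one dictionary per email with the predicted label and its score. Since the model was only validated on the Enron data, treat predictions on German or other non-English mail as unverified.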