---
license: mit
datasets:
- SetFit/enron_spam
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- email
- multilingual
---

# XLM-RoBERTa for multilingual spam detection

I trained this model to detect spam in German. There is no labeled German spam mail dataset, and I could not find an existing multilingual model trained on the Enron spam dataset, so I trained one myself.

## Intended use
Identifying spam mail in any XLM-RoBERTa-supported language.
Note that the model has not been thoroughly tested for its intended use; it was only validated on the Enron mail dataset.
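
A minimal sketch of how the model might be called through the `transformers` text-classification pipeline. The model ID is a placeholder, since this card does not state the repository name, and the helper names `load_classifier`, `classify_emails`, and `main` are made up for illustration. The label strings returned depend on the model's config and are not documented here, so check them before mapping them to spam or ham.

```python
from typing import Callable, Dict, List

# Placeholder, not a real repository name: replace with this model's Hub ID.
MODEL_ID = "your-username/your-model-name"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline for the given model ID."""
    # Imported lazily so classify_emails can be used without loading transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_emails(texts: List[str], classifier: Callable) -> List[Dict]:
    """Run the classifier on each mail and pair every text with its prediction."""
    # truncation=True cuts long mails down to the model's maximum input length.
    predictions = classifier(texts, truncation=True)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(texts, predictions)
    ]


def main() -> None:
    """Example run on two made-up mails (requires the real model ID)."""
    classifier = load_classifier()
    mails = [
        "Herzlichen Glückwunsch! Sie haben 1.000.000 Euro gewonnen. Klicken Sie hier.",
        "Hallo Anna, anbei das Protokoll vom Meeting am Dienstag.",
    ]
    for result in classify_emails(mails, classifier):
        print(result["label"], round(result["score"], 3), result["text"])
```

Calling `main()` downloads the model and prints one label and score per mail. Since the model was only validated on English Enron data, scores on other languages should be treated with caution.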

## Evaluation

Results on the test set of the Enron spam dataset:

- loss: 0.0315
- accuracy: 0.996
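
For reference, accuracy is assumed here to follow the standard definition: the fraction of test mails whose predicted label matches the gold label. A minimal sketch with a hypothetical helper and toy labels (not the actual evaluation data):

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Return the fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if len(gold) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy example: three of four predictions match the gold labels.
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 0.75
```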