# Text Classification into DB07 codes

This model is a fine-tuned version of `xlm-roberta-base` that classifies Danish descriptions of activities into Dansk Branchekode DB07 codes.


## Data
The model was fine-tuned on approximately 2.5 million descriptions of activities written by Norwegian and Danish businesses. The Norwegian descriptions were translated into Danish, and the Norwegian SN 2007 codes were mapped to Danish DB07 codes.

## Quick Start

```python
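from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

model_name = "CasperEriksen/xlm-roberta-base-finetuned-db07"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# "text-classification" is the canonical task name; "sentiment-analysis" is an alias for it.
# The deprecated return_all_scores=False is dropped; returning only the top label is the default.
pl = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
)

# Returns the highest-scoring label for the description ("Sale of clothing").
pl("Salg af tøj")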
```
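
The call above returns only the single best label. When a description is ambiguous, it can help to look at several candidate codes. The sketch below is one possible way to do that, not part of the model itself: it assumes the full score list has already been obtained as a flat list of `{"label", "score"}` dictionaries (for example by passing `top_k=None` to the pipeline call, which recent versions of `transformers` accept). The helper `top_k_labels` and the labels and scores in `example_scores` are hypothetical and only illustrate the shape of the data.

```python
from typing import Dict, List, Union

Score = Dict[str, Union[str, float]]


def top_k_labels(scores: List[Score], k: int = 3, min_score: float = 0.0) -> List[Score]:
    """Return up to k entries with score >= min_score, highest score first."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    return [s for s in ranked if s["score"] >= min_score][:k]


# Hypothetical scores: the labels and numbers are invented for illustration
# and are not real model output or real DB07 codes.
example_scores = [
    {"label": "CODE_A", "score": 0.07},
    {"label": "CODE_B", "score": 0.81},
    {"label": "CODE_C", "score": 0.12},
]

for entry in top_k_labels(example_scores, k=2):
    print(entry["label"], entry["score"])
```

The `min_score` threshold is optional; it is useful when low-confidence candidates should be dropped rather than shown to a reviewer.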