	# Classifying Text into NACE Codes

	This model is [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) fine-tuned to classify descriptions of activities into [NACE Rev. 2](https://ec.europa.eu/eurostat/web/nace-rev2) codes.


	## Data
	The fine-tuning data consist of 2.5 million descriptions of activities from Norwegian and Danish businesses. To improve the model's multilingual performance, random samples of these descriptions were machine-translated into the following languages:
	- English
	- German
	- Spanish
	- French
	- Finnish


	## Quick Start

	```python
	from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

	model_id = "erst/xlm-roberta-base-finetuned-nace"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForSequenceClassification.from_pretrained(model_id)

	# "text-classification" is the canonical task name; "sentiment-analysis" is an alias for it.
	pl = pipeline(
	    "text-classification",
	    model=model,
	    tokenizer=tokenizer,
	)

	# By default the pipeline returns only the top-scoring label for each input.
	print(pl("We sell clothes"))
	```
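
A single description can plausibly match more than one NACE code, so it may help to look at several candidates rather than only the top one. The following is a minimal sketch, not part of the original model card: `top_k_predictions` is a hypothetical helper, and the labels and scores in `example_scores` are made up for illustration (the label format the model actually emits is an assumption here). It only assumes the usual text-classification output shape, a list of dicts with `label` and `score` keys.

```python
from typing import List


def top_k_predictions(scores: List[dict], k: int = 3, min_score: float = 0.0) -> List[dict]:
    """Return up to k entries with score >= min_score, highest score first."""
    ranked = sorted(scores, key=lambda item: item["score"], reverse=True)
    return [item for item in ranked[:k] if item["score"] >= min_score]


# Hypothetical values for illustration only; these are NOT real model output,
# and the label format is an assumption.
example_scores = [
    {"label": "46.42", "score": 0.08},
    {"label": "47.71", "score": 0.85},
    {"label": "14.13", "score": 0.05},
    {"label": "47.91", "score": 0.02},
]

best = top_k_predictions(example_scores, k=2)
print(best)
```

In recent versions of `transformers`, the scores for all labels can be requested by passing `top_k=None` when calling the pipeline, for example `pl("We sell clothes", top_k=None)`; the resulting list can then be passed to the helper.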