saattrupdan committed
Commit beafdfd · 1 Parent(s): 44ff45c

Create README.md

  1. README.md +78 -0
README.md ADDED
@@ -0,0 +1,78 @@
+ ---
+ pipeline_tag: zero-shot-classification
+ language:
+ - da
+ - no
+ - nb
+ - sv
+ license: mit
+ datasets:
+ - strombergnlp/danfever
+ - KBLab/overlim
+ - MoritzLaurer/multilingual-NLI-26lang-2mil7
+ model-index:
+ - name: nb-bert-base-ner-scandi
+   results: []
+ widget:
+ - example_title: Danish
+   text: Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'
+   candidate_labels: sundhed, politik, sport, religion
+ - example_title: Norwegian
+   text: Regjeringen i Russland hevder Norge fører en politikk som vil føre til opptrapping i Arktis og «den endelige ødeleggelsen av russisk-norske relasjoner».
+   candidate_labels: helse, politikk, sport, religion
+ - example_title: Swedish
+   text: Så luras kroppens immunförsvar att bota cancer
+   candidate_labels: hälsa, politik, sport, religion
+ inference:
+   parameters:
+     hypothesis_template: "Dette eksempel handler om {}"
+ ---
+
+ # ScandiNLI - Natural Language Inference model for Scandinavian Languages
+
+ This model is a fine-tuned version of [NbAiLab/nb-bert-base](https://huggingface.co/NbAiLab/nb-bert-base) for Natural Language Inference in Danish, Norwegian Bokmål and Swedish.
+
+ It has been fine-tuned on a dataset composed of [DanFEVER](https://aclanthology.org/2021.nodalida-main.pdf#page=439), versions of [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) and [CommitmentBank](https://doi.org/10.18148/sub/2019.v23i2.601) machine-translated into all three languages, and versions of [FEVER](https://aclanthology.org/N18-1074/) and [Adversarial NLI](https://aclanthology.org/2020.acl-main.441/) machine-translated into Swedish.
+
+ The three languages are sampled equally during training. The model is validated on validation splits of [DanFEVER](https://aclanthology.org/2021.nodalida-main.pdf#page=439) (Danish) and of machine-translated versions of [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) for Swedish and Norwegian Bokmål, with the three languages again sampled equally.
+
+
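The equal sampling of the three languages described in the README could look roughly like the sketch below. It is not the authors' training code: the helper name `sample_equally`, the toy example lists and the seed are illustrative assumptions, and only the idea of choosing uniformly between languages comes from the description.

```python
import random
from collections import Counter


def sample_equally(datasets, num_samples, seed=0):
    """Draw examples so that each language is picked with equal probability.

    `datasets` maps a language code to a list of examples. The language is
    chosen uniformly at random first, then an example is drawn from that
    language, so a larger corpus does not dominate the mix.
    """
    rng = random.Random(seed)
    languages = sorted(datasets)
    samples = []
    for _ in range(num_samples):
        lang = rng.choice(languages)
        samples.append((lang, rng.choice(datasets[lang])))
    return samples


# Toy stand-ins for the real corpora (hypothetical data, deliberately unbalanced).
toy = {
    "da": [f"da-{i}" for i in range(1000)],
    "nb": [f"nb-{i}" for i in range(100)],
    "sv": [f"sv-{i}" for i in range(10)],
}
counts = Counter(lang for lang, _ in sample_equally(toy, num_samples=3000))
print(counts)  # each language lands near 1000 despite the size imbalance
```

Picking the language before the example is what keeps the mix balanced. Concatenating the corpora and sampling from the result would instead weight each language by its size.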
+ ## Quick start
+
+ You can use this model in your scripts as follows:
+
+ ```python
+ >>> from transformers import pipeline
+ >>> classifier = pipeline(
+ ...     "zero-shot-classification",
+ ...     model="alexandrainst/nb-bert-base-nli-scandi",
+ ... )
+ >>> classifier(
+ ...     "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
+ ...     candidate_labels=['sundhed', 'politik', 'sport', 'religion'],
+ ...     hypothesis_template="Dette eksempel handler om {}",
+ ... )
+ {'sequence': "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
+  'labels': ['sport', 'religion', 'politik', 'sundhed'],
+  'scores': [0.6134647727012634,
+   0.30309760570526123,
+   0.05021871626377106,
+   0.03321893885731697]}
+ ```
+
+
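For readers wondering how an NLI model produces the scores in the Quick start, here is a minimal sketch of the idea behind zero-shot classification. Each candidate label is slotted into the hypothesis template, the NLI model scores how strongly the text entails each hypothesis, and in single-label mode the entailment scores are normalised with a softmax across labels. The helpers `build_hypotheses` and `rank_labels` are hypothetical names, not part of `transformers`, and the logits are made-up numbers standing in for real model output.

```python
import math


def build_hypotheses(labels, template="Dette eksempel handler om {}"):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def rank_labels(labels, entailment_logits):
    """Softmax the per-label entailment logits and sort labels by score."""
    peak = max(entailment_logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in entailment_logits]
    total = sum(exps)
    scores = [value / total for value in exps]
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


labels = ["sundhed", "politik", "sport", "religion"]
hypotheses = build_hypotheses(labels)

# Made-up entailment logits, one per hypothesis (assumption, not model output).
made_up_logits = [-1.2, -0.8, 1.7, 1.0]
ranked = rank_labels(labels, made_up_logits)

for hypothesis in hypotheses:
    print(hypothesis)
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

The template matters because the model only ever sees full premise and hypothesis sentences, so a template in the same language as the input text is the natural choice.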
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 4242
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 500
+ - max_steps: 50,000
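To make the hyperparameter list concrete, the sketch below computes the learning rate implied by a linear schedule with 500 warmup steps and 50,000 total steps, plus the effective batch size. It assumes the usual shape of a `linear` scheduler: a linear ramp from zero to the peak rate during warmup, then a linear decay to zero at `max_steps`. The function names are illustrative, and `num_devices=2` is an assumption: the list does not state a device count, and 2 is simply the value for which 8 × 2 × devices gives the listed total of 32.

```python
def linear_schedule_lr(step, peak_lr=2e-5, warmup_steps=500, max_steps=50_000):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Decay: fall linearly from the peak to 0 at max_steps.
    remaining = max(0, max_steps - step)
    return peak_lr * remaining / (max_steps - warmup_steps)


def effective_batch_size(per_device, grad_accum_steps, num_devices):
    """Number of examples that contribute to one optimizer update."""
    return per_device * grad_accum_steps * num_devices


for step in (0, 250, 500, 25_250, 50_000):
    print(step, linear_schedule_lr(step))

# num_devices=2 is an assumption (see the paragraph above the code).
print(effective_batch_size(8, 2, num_devices=2))  # 32
```

On a single device the same settings would give an effective batch size of 16, so the listed total suggests either two devices or a different accumulation setup than the one shown.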