---
language:
- en
tags:
- text-classification
- zero-shot-classification
metrics:
- accuracy
widget:
- text: "I first thought that I liked the movie, but upon second thought it was actually disappointing. [SEP] The movie was good."
---

# DeBERTa-v3-large-mnli-fever-anli

## Model description

This model was trained on the Multi-Genre Natural Language Inference (MultiNLI) dataset, which consists of 433k sentence pairs annotated with textual entailment information.

The base model is [DeBERTa-v3-large from Microsoft](https://huggingface.co/microsoft/deberta-large). DeBERTa v3 outperforms BERT and RoBERTa on the majority of NLU benchmarks by using disentangled attention and an enhanced mask decoder. More information about the original model is available in the [official repository](https://github.com/microsoft/DeBERTa) and the [paper](https://arxiv.org/abs/2006.03654).

## Intended uses & limitations

#### How to use the model

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repository id assumed from this card's title; adjust it if it differs.
model_name = "khalidalt/DeBERTa-v3-large-mnli-fever-anli"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)

premise = "The movie has been criticized for the story. However, I think it is a great movie."
hypothesis = "I liked the movie."

# Encode the premise-hypothesis pair and turn the logits into probabilities.
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
output = model(inputs["input_ids"].to(device))
prediction = torch.softmax(output["logits"][0], -1)

label_names = ["entailment", "neutral", "contradiction"]
print(label_names[prediction.argmax(0).tolist()])
```

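Because this is an NLI model, it can also be plugged into the `transformers` zero-shot classification pipeline. A minimal sketch, assuming the same Hub repository id as above:

```python
from transformers import pipeline

# Repository id assumed from this card's title; adjust it if it differs.
classifier = pipeline("zero-shot-classification", model="khalidalt/DeBERTa-v3-large-mnli-fever-anli")

sequence = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
print(classifier(sequence, candidate_labels=["positive", "negative", "neutral"]))
```
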
### Training data

This model was trained on the MultiNLI dataset, which consists of 433k sentence pairs annotated with textual entailment information.

### Training procedure

The model was trained using the Hugging Face `Trainer` with the following hyperparameters:

```python
from transformers import TrainingArguments

train_args = TrainingArguments(
    output_dir="./output",  # assumed; the output path is not given in the original card
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    warmup_ratio=0.06,
    weight_decay=0.1,
    fp16=True,
    seed=42,
)
```

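For context, here is a minimal sketch of how these arguments could be passed to the `Trainer`. The exact preprocessing and splits used for this model are not stated in the card, so the dataset handling below is an assumption:

```python
from datasets import load_dataset
from transformers import Trainer

# Tokenize MultiNLI premise-hypothesis pairs; `tokenizer` and `model` as loaded above.
mnli = load_dataset("multi_nli")
encoded = mnli.map(
    lambda batch: tokenizer(batch["premise"], batch["hypothesis"], truncation=True),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=train_args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation_matched"],
)
trainer.train()
```
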
## Limitations and bias

Please consult the original DeBERTa paper and the literature on different NLI datasets for potential biases.

### BibTeX entry and citation info

Please cite the [DeBERTa paper](https://arxiv.org/abs/2006.03654) and the [MultiNLI dataset](https://cims.nyu.edu/~sbowman/multinli/paper.pdf) if you use this model, and include a link to this Hugging Face Hub page.

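For convenience, standard BibTeX entries for the two works (please verify against the published versions):

```bibtex
@inproceedings{he2021deberta,
  title     = {DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
  author    = {Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://openreview.net/forum?id=XPZIaotutsD}
}

@inproceedings{williams-etal-2018-broad,
  title     = {A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference},
  author    = {Williams, Adina and Nangia, Nikita and Bowman, Samuel R.},
  booktitle = {Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},
  year      = {2018}
}
```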