---
language: en
license: cc-by-4.0
tags:
- text-classification
repo: https://huggingface.co/booyu/DeBERTa-v3-large_finetune
---
# Model Card for j72446cx-n35081bw-NLI
<!-- Provide a quick summary of what the model is/does. -->
This is a pair classification model trained to determine whether a given “hypothesis” logically follows from a given “premise”.
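A minimal inference sketch is given below. It assumes the fine-tuned checkpoint is available at the repository listed in the header, and the label mapping (index 1 = entailment) is an assumption that should be checked against the training configuration.

```python
# Minimal inference sketch (assumed checkpoint location and label mapping; verify both).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "booyu/DeBERTa-v3-large_finetune"  # repo from the header; adjust if needed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."

# The premise and hypothesis are encoded together as a single sequence pair.
inputs = tokenizer(premise, hypothesis, return_tensors="pt",
                   truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
predicted = logits.argmax(dim=-1).item()
print("entailment" if predicted == 1 else "not entailment")  # assumed: 1 = entailment
```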
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
This model is based upon a DeBERTa-v3 model that was fine-tuned on 27K pairs of texts.
- **Developed by:** Boyu Wei and Changyi Xin
- **Language(s):** English
- **Model type:** Supervised
- **Model architecture:** Transformers
- **Finetuned from model [optional]:** DeBERTa-v3-large
### Model Resources
<!-- Provide links where applicable. -->
- **Repository:** https://huggingface.co/microsoft/deberta-v3-large
- **Paper or documentation:** https://arxiv.org/abs/2111.09543
## Training Details
### Training Data
<!-- This is a short stub of information on the training data that was used, and documentation related to data pre-processing or additional filtering (if applicable). -->
27K premise-hypothesis pairs with entailment and contradiction labels.
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Training Hyperparameters
<!-- This is a summary of the values of hyperparameters used in training the model. -->
      - learning_rate: 2e-05
      - train_batch_size: 8
      - eval_batch_size: 8
      - weight_decay: 0.0002
      - num_epochs: 2
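
The original training script is not part of this card; the sketch below only shows how the listed hyperparameters could be wired into the Hugging Face Trainer, with two toy pairs standing in for the real 27K-pair training set.

```python
# Illustrative sketch only: the toy pairs below are stand-ins for the real training data.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-large", num_labels=2)

pairs = [("A man is playing a guitar.", "Someone is making music."),
         ("A man is playing a guitar.", "The room is completely silent.")]
labels = [1, 0]  # assumed mapping: 1 = entailment, 0 = contradiction
enc = tokenizer([p for p, _ in pairs], [h for _, h in pairs],
                truncation=True, max_length=512, padding=True)

class PairDataset(torch.utils.data.Dataset):
    """Wraps tokenized premise-hypothesis pairs for the Trainer."""
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

train_dataset = PairDataset(enc, labels)

# Hyperparameters taken from the list above.
args = TrainingArguments(
    output_dir="deberta-v3-large-nli",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.0002,
    num_train_epochs=2,
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()
```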

#### Speeds, Sizes, Times

<!-- This section provides information about how roughly how long it takes to train the model and the size of the resulting model. -->


      - overall training time: 30 minutes
      - duration per training epoch: 15 minutes
      - model size: 1.7GB

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data & Metrics

#### Testing Data

<!-- This should describe any evaluation data used (e.g., the development/validation set provided). -->

A subset of the development set provided, amounting to 6.7K pairs.

#### Metrics

<!-- These are the evaluation metrics being used. -->


      - Macro-p: 0.928
      - Macro-r: 0.927
      - Macro-F1: 0.927
      - W_Macro-p: 0.928
      - W_Macro-r: 0.928
      - W_Macro-F1: 0.928
      - MCC: 0.855
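
These values follow standard definitions (macro- and weighted-averaged precision/recall/F1 plus the Matthews correlation coefficient); the sketch below shows how such figures could be computed with scikit-learn, using placeholder labels and predictions rather than the actual model outputs.

```python
# Sketch of computing the reported metric types; y_true / y_pred are placeholders.
from sklearn.metrics import precision_recall_fscore_support, matthews_corrcoef

y_true = [1, 0, 1, 1, 0]   # gold labels (placeholder)
y_pred = [1, 0, 1, 0, 0]   # model predictions (placeholder)

macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")
w_p, w_r, w_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted")
mcc = matthews_corrcoef(y_true, y_pred)

print(f"Macro P/R/F1: {macro_p:.3f}/{macro_r:.3f}/{macro_f1:.3f}")
print(f"Weighted P/R/F1: {w_p:.3f}/{w_r:.3f}/{w_f1:.3f}")
print(f"MCC: {mcc:.3f}")
```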
### Results
The model obtained a macro F1-score of roughly 93% (0.927) and a Matthews correlation coefficient of 0.855 on the evaluation data described above.
## Technical Specifications
### Hardware
      - RAM: at least 16 GB
      - Storage: at least 2 GB
      - GPU: V100
### Software
      - Transformers 4.18.0
      - PyTorch 1.11.0+cu113
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
Any input (the concatenation of the two sequences) longer than 512 subwords will be truncated by the model.
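As a rough illustration of this limit, the tokenizer-side truncation can be reproduced as follows; the long premise is an invented example, and 512 is the maximum sequence length assumed here.

```python
# Illustration of the 512-subword limit: the concatenated pair is truncated,
# so content beyond that point cannot influence the prediction.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-large")

premise = "This sentence is repeated to exceed the length limit. " * 200  # invented long input
hypothesis = "The input is very long."

encoded = tokenizer(premise, hypothesis, truncation=True, max_length=512)
print(len(encoded["input_ids"]))  # capped at 512; tokens past the limit are dropped
```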
## Additional Information
<!-- Any other information that would be useful for other people to know. -->
The hyperparameters were determined by experimenting with different values.