---
language: en
license: cc-by-4.0
tags:
- text-classification
repo: https://huggingface.co/booyu/DeBERTa-v3-large_finetune
---

# Model Card for j72446cx-n35081bw-NLI

This is a pair classification model that was trained to determine whether a given “hypothesis” logically follows from a given “premise”.

## Model Details

### Model Description

This model is based upon a DeBERTa-v3 model that was fine-tuned on 27K pairs of texts.

- **Developed by:** Boyu Wei and Changyi Xin
- **Language(s):** English
- **Model type:** Supervised
- **Model architecture:** Transformers
- **Finetuned from model:** DeBERTa-v3-large

### Model Resources

- **Repository:** https://huggingface.co/microsoft/deberta-v3-large
- **Paper or documentation:** https://arxiv.org/abs/2111.09543

## Training Details

### Training Data

27K premise-hypothesis pairs labelled with entailment and contradiction.

### Training Procedure

#### Training Hyperparameters

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- weight_decay: 0.0002
- num_epochs: 2

#### Speeds, Sizes, Times

- overall training time: 30 minutes
- duration per training epoch: 15 minutes
- model size: 1.7 GB

## Evaluation

### Testing Data & Metrics

#### Testing Data

A subset of the provided development set, amounting to 6.7K pairs.

#### Metrics

- Macro-p: 0.928
- Macro-r: 0.927
- Macro-F1: 0.927
- Weighted Macro-p: 0.928
- Weighted Macro-r: 0.928
- Weighted Macro-F1: 0.928
- MCC: 0.855

### Results

The model obtained a macro F1-score of 93% and an MCC of 86%.

## Technical Specifications

### Hardware

- RAM: at least 16 GB
- Storage: at least 2 GB
- GPU: V100

### Software

- Transformers 4.18.0
- PyTorch 1.11.0+cu113

## Bias, Risks, and Limitations

Any input (the concatenation of two sequences) longer than 512 subwords will be truncated by the model.

## Additional Information

The hyperparameters were determined by experimentation with different values.
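
## How to Get Started with the Model

A minimal usage sketch for loading the fine-tuned checkpoint and classifying a premise-hypothesis pair with the `transformers` library. The checkpoint id below is taken from the repository link in the frontmatter, and the label names are assumptions; consult the model's `config.json` (`id2label`) for the actual mapping before relying on the printed label.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed to host the fine-tuned weights (repo URL from the frontmatter).
model_id = "booyu/DeBERTa-v3-large_finetune"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "A man is playing a guitar on stage."
hypothesis = "A man is performing music."

# Pairs longer than 512 subwords are truncated, as noted in the limitations section.
inputs = tokenizer(premise, hypothesis, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # e.g. "entailment" or "contradiction" (assumed label names)
```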