---
language: en
license: cc-by-4.0
tags:
- text-classification
repo: https://huggingface.co/booyu/DeBERTa-v3-large_finetune
---
# Model Card for j72446cx-n35081bw-NLI

This is a pair-classification model trained to determine whether a given “hypothesis” logically follows from a given “premise”.
## Model Details

### Model Description

This model is based on DeBERTa-v3-large, fine-tuned on 27K pairs of texts; a minimal inference sketch follows the details below.
- Developed by: Boyu Wei and Changyi Xin
- Language(s): English
- Model type: Supervised
- Model architecture: Transformers
- Finetuned from model: DeBERTa-v3-large
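A minimal inference sketch with the `transformers` library is given below. The checkpoint name comes from the repository link in this card; the example premise/hypothesis pair and the label mapping are illustrative assumptions, not part of the original card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Checkpoint from the repository listed in this card.
model_name = "booyu/DeBERTa-v3-large_finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Illustrative input pair.
premise = "A man is playing a guitar on stage."
hypothesis = "A man is performing music."

# Encode the pair as a single sequence; anything beyond 512 subwords
# is truncated (see Bias, Risks, and Limitations below).
inputs = tokenizer(premise, hypothesis, truncation=True,
                   max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Predicted class index; the index-to-label mapping depends on the
# training setup and is not specified in this card.
print(logits.argmax(dim=-1).item())
```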
### Model Resources
- Repository: https://huggingface.co/microsoft/deberta-v3-large
- Paper or documentation: https://arxiv.org/abs/2111.09543
## Training Details

### Training Data
27K premise–hypothesis pairs labelled for entailment and contradiction.
### Training Procedure

#### Training Hyperparameters
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- weight_decay: 0.0002
- num_epochs: 2
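As a sketch, these values correspond to the following Hugging Face `TrainingArguments`; the `output_dir` is a placeholder, and the dataset loading and `Trainer` wiring are omitted, so this is not necessarily the authors' exact training script.

```python
from transformers import TrainingArguments

# Hyperparameters from this card; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="deberta-v3-large-nli",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.0002,
    num_train_epochs=2,
)
```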
#### Speeds, Sizes, Times

- overall training time: 30 minutes
- duration per training epoch: 15 minutes
- model size: 1.7 GB
## Evaluation

### Testing Data & Metrics

#### Testing Data

A subset of the provided development set, amounting to 6.7K pairs.

#### Metrics
- Macro precision: 0.928
- Macro recall: 0.927
- Macro F1: 0.927
- Weighted macro precision: 0.928
- Weighted macro recall: 0.928
- Weighted macro F1: 0.928
- MCC: 0.855
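For reference, all of the figures above can be computed with scikit-learn given gold labels and model predictions. In this sketch, `y_true` and `y_pred` are small stand-ins for the 6.7K development-set labels and predictions, not real data.

```python
from sklearn.metrics import matthews_corrcoef, precision_recall_fscore_support

# Stand-ins for the development-set gold labels and model predictions.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

for avg in ("macro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg)
    print(f"{avg}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
print(f"MCC: {matthews_corrcoef(y_true, y_pred):.3f}")
```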
### Results

The model obtained a macro-F1 of 0.927 and an MCC of 0.855 on the testing data described above.
## Technical Specifications

### Hardware
- RAM: at least 16 GB
- Storage: at least 2 GB
- GPU: V100
### Software

- Transformers 4.18.0
- PyTorch 1.11.0+cu113
## Bias, Risks, and Limitations

Any input (the concatenation of the premise and hypothesis sequences) longer than 512 subword tokens will be truncated by the model.
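A quick way to check whether a pair will be truncated before sending it to the model (a sketch; the checkpoint name repeats the repository above, and the over-length input is fabricated for illustration):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("booyu/DeBERTa-v3-large_finetune")

premise = " ".join(["example"] * 600)  # illustrative over-length input
hypothesis = "This pair is too long."

n_tokens = len(tokenizer(premise, hypothesis)["input_ids"])
if n_tokens > 512:
    print(f"{n_tokens} subword tokens: this pair will be truncated to 512.")
```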
## Additional Information

The hyperparameter values were determined by experimenting with a range of settings.