
---
language: en
license: cc-by-4.0
tags:
---


Model Card for j72446cx-n35081bw-NLI

This is a pair classification model trained to determine whether a given “hypothesis” logically follows from a given “premise”.
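
A minimal inference sketch is shown below; the Hub repository ID and the label-to-class mapping are assumptions rather than details confirmed by this card.

```python
# A minimal sketch, assuming the checkpoint is published under this repo ID
# and that the label mapping is {0: contradiction, 1: entailment} (both assumed).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "j72446cx-n35081bw-NLI"  # hypothetical Hub ID taken from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."

# Encode the premise-hypothesis pair as a single sequence, as the model expects.
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # 1 = entailment, 0 = contradiction (assumed)
```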

Model Details

Model Description

This model is based on DeBERTa-v3 and was fine-tuned on 27K pairs of texts.

  • Developed by: Boyu Wei and Changyi Xin
  • Language(s): English
  • Model type: Supervised
  • Model architecture: Transformers
  • Fine-tuned from model: DeBERTa-v3-large

Training Details

Training Data

27K premise-hypothesis pairs labelled as entailment or contradiction
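
For illustration, one training instance might look like the following; the field names and the 0/1 label encoding are assumptions, since the card does not specify the data format.

```python
# Hypothetical shape of a single training example; the field names and the
# label encoding are assumptions, not taken from the card.
example = {
    "premise": "The committee approved the budget on Tuesday.",
    "hypothesis": "The budget was approved.",
    "label": 1,  # 1 = entailment, 0 = contradiction (assumed)
}
```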

Training Procedure

Training Hyperparameters

  - learning_rate: 2e-05
  - train_batch_size: 8
  - eval_batch_size: 8
  - weight_decay: 0.0002
  - num_epochs: 2
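
These values map onto the Hugging Face Trainer API roughly as in the sketch below; whether the Trainer was actually used is an assumption, and the output path is hypothetical.

```python
# A sketch of the hyperparameters above expressed as TrainingArguments;
# assumes the standard Hugging Face Trainer was used (not confirmed by the card).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-v3-large-nli",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=2e-4,
    num_train_epochs=2,
)
```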

Speeds, Sizes, Times

  - overall training time: 30 minutes
  - duration per training epoch: 15 minutes
  - model size: 1.7 GB

Evaluation

Testing Data & Metrics

Testing Data

A subset of the development set provided, amounting to 6.7K pairs.

Metrics

  - Macro precision: 0.928
  - Macro recall: 0.927
  - Macro F1: 0.927
  - Weighted macro precision: 0.928
  - Weighted macro recall: 0.928
  - Weighted macro F1: 0.928
  - MCC: 0.855
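
The metrics above can be reproduced with scikit-learn as sketched below; y_true and y_pred are placeholders for the gold and predicted labels on the test split, not real outputs.

```python
# A sketch of computing the reported metrics with scikit-learn;
# y_true / y_pred are hypothetical placeholders, not actual predictions.
from sklearn.metrics import matthews_corrcoef, precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0]  # gold labels (placeholder)
y_pred = [1, 0, 1, 0, 0]  # model predictions (placeholder)

macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")
w_p, w_r, w_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted")
mcc = matthews_corrcoef(y_true, y_pred)
print(f"macro-F1={macro_f1:.3f}  weighted-F1={w_f1:.3f}  MCC={mcc:.3f}")
```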

Results

The model obtained a macro F1-score of about 93% and an MCC of 0.855.

Technical Specifications

Hardware

  - RAM: at least 16 GB
  - Storage: at least 2 GB
  - GPU: V100

Software

  - Transformers 4.18.0
  - PyTorch 1.11.0+cu113

Bias, Risks, and Limitations

Any input (the concatenation of the premise and the hypothesis) longer than 512 subwords will be truncated by the model, so content beyond that limit is ignored.
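
The sketch below demonstrates this truncation with the base model's tokenizer; using microsoft/deberta-v3-large as the tokenizer source is an assumption about this checkpoint.

```python
# A sketch of the 512-subword limit; the tokenizer ID is an assumption
# based on the stated base model, not confirmed by the card.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-large")

long_premise = "word " * 1000   # deliberately exceeds the limit
hypothesis = "A short hypothesis."

encoded = tokenizer(long_premise, hypothesis, truncation=True, max_length=512)
print(len(encoded["input_ids"]))  # 512: everything beyond is silently dropped
```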

Additional Information

The hyperparameters were chosen empirically by experimenting with a range of values.
