Model Card for WeightWatcher/albert-large-v2-qnli
This model was finetuned on the GLUE QNLI task, starting from the pretrained albert-large-v2 model. Hyperparameters were largely taken from the following publication; the exception documented in this card is the maximum sequence length.
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations https://arxiv.org/abs/1909.11942
Model Details
Model Description
- Developed by: https://huggingface.co/cdhinrichs
- Model type: Text Sequence Classification
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: https://huggingface.co/albert-large-v2
Uses
Text classification, research and development.
Out-of-Scope Use
Not intended for production use. See https://huggingface.co/albert-large-v2
Bias, Risks, and Limitations
See https://huggingface.co/albert-large-v2
Recommendations
See https://huggingface.co/albert-large-v2
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AlbertForSequenceClassification
model = AlbertForSequenceClassification.from_pretrained("WeightWatcher/albert-large-v2-qnli")
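The loading call above returns only the classifier; to score a question/sentence pair, a tokenizer and a mapping from logits to labels are also needed. The following is a minimal sketch, not the author's own inference code. It assumes the tokenizer can be loaded from the base albert-large-v2 checkpoint, that the class order follows the GLUE QNLI convention (index 0 = entailment, index 1 = not_entailment), and that inputs are truncated to 128 tokens as in training. The names `QNLI_LABELS`, `label_from_logits`, and `predict` are hypothetical helpers introduced for illustration.

```python
# Assumed label order (GLUE QNLI convention); verify against the model's config.
QNLI_LABELS = ("entailment", "not_entailment")


def label_from_logits(logits):
    """Return the label name of the highest-scoring class."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return QNLI_LABELS[best]


def predict(question, sentence,
            model_id="WeightWatcher/albert-large-v2-qnli",
            tokenizer_id="albert-large-v2"):  # assumption: base-model tokenizer
    """Classify whether `sentence` answers `question`."""
    # Imported lazily so label_from_logits works without these packages.
    import torch
    from transformers import AlbertForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AlbertForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Encode the pair as one sequence, truncated to the training length of 128.
    inputs = tokenizer(question, sentence, truncation=True,
                       max_length=128, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Calling `predict("Where is the Eiffel Tower?", "The Eiffel Tower is in Paris.")` downloads the weights on first use and returns one of the two label names.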
Training Details
Training Data
See https://huggingface.co/datasets/glue#qnli
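QNLI (Question-answering NLI) is a binary sentence-pair classification task in the GLUE benchmark: given a question and a sentence, the model decides whether the sentence contains the answer to the question.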
Training Procedure
Adam optimization was used on the pretrained ALBERT model at https://huggingface.co/albert-large-v2.
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations https://arxiv.org/abs/1909.11942
Training Hyperparameters
Training hyperparameters (learning rate, batch size, ALBERT dropout rate, classifier dropout rate, warmup steps, training steps) were taken from Table A.4 in:
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations https://arxiv.org/abs/1909.11942
Max sequence length (MSL) was set to 128, which differs from the value listed in that table.
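Warmup steps and training steps usually parameterize a learning-rate schedule. This card does not state the schedule shape, so the sketch below assumes a common choice: linear warmup to the base learning rate followed by linear decay to zero. The function name `linear_warmup_decay` is hypothetical, and the numbers in the usage comment are placeholders, not the values from Table A.4.

```python
def linear_warmup_decay(step, base_lr, warmup_steps, training_steps):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumed schedule shape; the model card does not specify one.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay from base_lr at the end of warmup to 0 at training_steps.
    remaining = max(0, training_steps - step)
    return base_lr * remaining / max(1, training_steps - warmup_steps)


# Placeholder values for illustration only (not taken from the paper):
# lr = linear_warmup_decay(step=50, base_lr=1e-5,
#                          warmup_steps=100, training_steps=1000)
```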
Evaluation
Classification accuracy is used to evaluate model performance.
Testing Data, Factors & Metrics
Testing Data
See https://huggingface.co/datasets/glue#qnli
Metrics
Classification accuracy
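Classification accuracy is the fraction of examples whose predicted label matches the reference label. A minimal sketch of that computation follows; the function name `classification_accuracy` is hypothetical and this is not necessarily the evaluation code used to produce the reported numbers.

```python
def classification_accuracy(predictions, references):
    """Fraction of positions where the prediction equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Three of four predictions match the references.
print(classification_accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```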
Results
Training Classification accuracy: 0.9997613205655748
Evaluation Classification accuracy: 0.9194581731649277
Environmental Impact
The model was finetuned on a single-user workstation with a single GPU. CO2 impact is expected to be minimal.