---
language:
- "en"
license: mit
datasets:
- glue
metrics:
- Classification accuracy
---

# Model Card for WeightWatcher/albert-large-v2-qnli

This model was finetuned on the GLUE/qnli task, based on the pretrained albert-large-v2 model. Hyperparameters were (largely) taken from the following publication, with some minor exceptions.

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

https://arxiv.org/abs/1909.11942

## Model Details

### Model Description

- **Developed by:** https://huggingface.co/cdhinrichs
- **Model type:** Text Sequence Classification
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** https://huggingface.co/albert-large-v2

## Uses

Text classification, research and development.

### Out-of-Scope Use

Not intended for production use.
See https://huggingface.co/albert-large-v2

## Bias, Risks, and Limitations

See https://huggingface.co/albert-large-v2

### Recommendations

See https://huggingface.co/albert-large-v2

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AlbertForSequenceClassification

# Load the finetuned sequence-classification weights from the Hugging Face Hub.
model = AlbertForSequenceClassification.from_pretrained("WeightWatcher/albert-large-v2-qnli")
```

## Training Details

### Training Data

See https://huggingface.co/datasets/glue#qnli

QNLI is a classification task, and a part of the GLUE benchmark.

### Training Procedure

Adam optimization was used on the pretrained ALBERT model at https://huggingface.co/albert-large-v2.

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

https://arxiv.org/abs/1909.11942

#### Training Hyperparameters

Training hyperparameters (learning rate, batch size, ALBERT dropout rate, classifier dropout rate, warmup steps, training steps) were taken from Table A.4 in:

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

https://arxiv.org/abs/1909.11942

Max sequence length (MSL) was set to 128, differing from the paper's setting.
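
To make the 128-token limit concrete, the sketch below shows one way a question/sentence pair could be trimmed to fit. It is a minimal illustration, not the preprocessing actually used for this model: the helper `truncate_pair`, the longest-first strategy, and the placeholder tokens are all assumptions. The count of three special tokens reflects the usual `[CLS] A [SEP] B [SEP]` layout for sentence pairs.

```python
MAX_SEQ_LEN = 128        # max sequence length stated in this card
NUM_SPECIAL_TOKENS = 3   # [CLS] question [SEP] sentence [SEP]


def truncate_pair(question, sentence, max_seq_len=MAX_SEQ_LEN):
    """Trim two token lists so the encoded pair fits in max_seq_len.

    Hypothetical helper. Uses longest-first truncation: drop the last
    token of whichever list is currently longer (the sentence on ties).
    This strategy is an assumption, not confirmed by the model card.
    """
    budget = max_seq_len - NUM_SPECIAL_TOKENS
    if budget < 0:
        raise ValueError("max_seq_len is too small to hold the special tokens")
    question, sentence = list(question), list(sentence)  # do not mutate inputs
    while len(question) + len(sentence) > budget:
        if len(question) > len(sentence):
            question.pop()
        else:
            sentence.pop()
    return question, sentence


# Placeholder tokens for illustration only (not taken from QNLI).
q, s = truncate_pair(["q"] * 20, ["s"] * 200)
print(len(q), len(s), len(q) + len(s) + NUM_SPECIAL_TOKENS)  # prints: 20 105 128
```

In practice a tokenizer would normally apply the truncation itself; the sketch only shows the arithmetic behind a 128-token budget.
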
## Evaluation

Classification accuracy is used to evaluate model performance.

### Testing Data, Factors & Metrics

#### Testing Data

See https://huggingface.co/datasets/glue#qnli

#### Metrics

Classification accuracy

### Results

Training Classification accuracy: 0.9997613205655748

Evaluation Classification accuracy: 0.9194581731649277

## Environmental Impact

The model was finetuned on a single user workstation with a single GPU. CO2 impact is expected to be minimal.