cdhinrichs committed
Commit ae1ffbb
1 Parent(s): 65b13ff

Added a model card

Files changed (1):
  1. README.md +99 -0
README.md CHANGED

---
language:
- "en"
license: mit
datasets:
- glue
metrics:
- accuracy
---

# Model Card for cdhinrichs/albert-large-v2-rte

This model was fine-tuned on the RTE task from the GLUE benchmark, starting
from the pretrained albert-large-v2 model. Hyperparameters were largely taken
from the following publication, with some minor exceptions:

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
https://arxiv.org/abs/1909.11942

## Model Details

### Model Description
- **Developed by:** https://huggingface.co/cdhinrichs
- **Model type:** Text Sequence Classification
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** https://huggingface.co/albert-large-v2

## Uses
Text classification, research and development.

### Out-of-Scope Use
Not intended for production use.
See https://huggingface.co/albert-large-v2

## Bias, Risks, and Limitations
See https://huggingface.co/albert-large-v2

### Recommendations
See https://huggingface.co/albert-large-v2

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AlbertForSequenceClassification

# Load the fine-tuned ALBERT sequence classification model for RTE.
model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-rte")
```
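
A fuller, purely illustrative example is sketched below: it pairs the classifier
with a tokenizer and scores a single premise/hypothesis pair. The tokenizer
source, the example sentences, and the label mapping are assumptions rather than
guarantees of this repository; check `model.config.id2label` for the mapping
actually stored in the checkpoint.

```python
import torch
from transformers import AlbertTokenizer, AlbertForSequenceClassification

# Assumption: the standard albert-large-v2 tokenizer is used. If this repository
# ships its own tokenizer files, load it from "cdhinrichs/albert-large-v2-rte" instead.
tokenizer = AlbertTokenizer.from_pretrained("albert-large-v2")
model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-rte")
model.eval()

# RTE is a sentence-pair task: does the premise entail the hypothesis?
premise = "A man is playing a guitar on stage."  # hypothetical example text
hypothesis = "A man is performing music."        # hypothetical example text

inputs = tokenizer(premise, hypothesis, return_tensors="pt",
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

# In the GLUE/RTE dataset, label 0 is "entailment" and label 1 is "not_entailment".
print(logits.argmax(dim=-1).item())
```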

## Training Details

### Training Data
See https://huggingface.co/datasets/glue#rte

RTE (Recognizing Textual Entailment) is a binary sentence-pair classification
task and part of the GLUE benchmark.
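
As an illustrative aside (not part of the original training code), the data can
be inspected with the datasets library:

```python
from datasets import load_dataset

# Each RTE example has "sentence1" (premise), "sentence2" (hypothesis), and "label".
rte = load_dataset("glue", "rte")
print(rte)              # train / validation / test splits
print(rte["train"][0])  # one premise/hypothesis pair with its label
```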

### Training Procedure
Adam optimization was used on the pretrained ALBERT model at
https://huggingface.co/albert-large-v2.

A checkpoint from MNLI was NOT used, differing from footnote 4 of:

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
https://arxiv.org/abs/1909.11942

#### Training Hyperparameters
Training hyperparameters (learning rate, batch size, ALBERT dropout rate,
classifier dropout rate, warmup steps, and training steps) were taken from
Table A.4 of:

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
https://arxiv.org/abs/1909.11942

Max sequence length (MSL) was set to 128, differing from the above.
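
Purely as a sketch of such a setup (not the script actually used, and with
placeholder values rather than the Table A.4 hyperparameters), fine-tuning with
the transformers Trainer could look like this; the Trainer's default AdamW
optimizer matches the Adam-style optimization described above:

```python
from datasets import load_dataset
from transformers import (AlbertForSequenceClassification, AlbertTokenizer,
                          Trainer, TrainingArguments)

MAX_SEQ_LEN = 128  # per this card; the paper uses a longer maximum sequence length

tokenizer = AlbertTokenizer.from_pretrained("albert-large-v2")
model = AlbertForSequenceClassification.from_pretrained(
    "albert-large-v2",
    num_labels=2,
    hidden_dropout_prob=0.1,      # placeholder: ALBERT dropout rate from Table A.4
    classifier_dropout_prob=0.1,  # placeholder: classifier dropout rate from Table A.4
)

def encode(batch):
    # Tokenize premise/hypothesis pairs to a fixed length of MAX_SEQ_LEN tokens.
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, max_length=MAX_SEQ_LEN, padding="max_length")

rte = load_dataset("glue", "rte").map(encode, batched=True)

args = TrainingArguments(
    output_dir="albert-large-v2-rte",
    learning_rate=2e-5,              # placeholder: learning rate from Table A.4
    per_device_train_batch_size=32,  # placeholder: batch size from Table A.4
    warmup_steps=200,                # placeholder: warmup steps from Table A.4
    max_steps=800,                   # placeholder: training steps from Table A.4
)

Trainer(model=model, args=args,
        train_dataset=rte["train"],
        eval_dataset=rte["validation"]).train()
```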

## Evaluation
Classification accuracy is used to evaluate model performance.

### Testing Data, Factors & Metrics

#### Testing Data
See https://huggingface.co/datasets/glue#rte

#### Metrics
Classification accuracy
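
The evaluation number reported below can be recomputed with a loop like the one
sketched here; it assumes evaluation on the GLUE validation split (RTE test
labels are not public) and reuses the tokenizer and max-length assumptions from
above:

```python
import torch
from datasets import load_dataset
from transformers import AlbertTokenizer, AlbertForSequenceClassification

tokenizer = AlbertTokenizer.from_pretrained("albert-large-v2")  # assumed tokenizer source
model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-rte")
model.eval()

validation = load_dataset("glue", "rte", split="validation")

correct = 0
for example in validation:
    inputs = tokenizer(example["sentence1"], example["sentence2"],
                       return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])

print("Validation accuracy:", correct / len(validation))
```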

### Results
Training classification accuracy: 0.9971887550200803

Evaluation classification accuracy: 0.8014440433212996


## Environmental Impact
The model was fine-tuned on a single user workstation with a single GPU. CO2
impact is expected to be minimal.