WeightWatcher
/

albert-large-v2-sst2

cdhinrichs committed on Aug 3, 2023

Commit

127c813

•

1 Parent(s): 07bbebf

Added a model card

Files changed (1)

README.md +97 -0

@@ -1,3 +1,100 @@
 ---
+language:
+  - "en"
 license: mit
+datasets:
+  - glue
+metrics:
+  - Classification accuracy
 ---
+# Model Card for cdhinrichs/albert-large-v2-sst2
+This model was finetuned on the GLUE/sst2 task, based on the pretrained
+albert-large-v2 model. Hyperparameters were (largely) taken from the following
+publication, with some minor exceptions.
+ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+https://arxiv.org/abs/1909.11942
+## Model Details
+### Model Description
+- **Developed by:** https://huggingface.co/cdhinrichs
+- **Model type:** Text Sequence Classification
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model:** https://huggingface.co/albert-large-v2
+## Uses
+Text classification, research and development.
+### Out-of-Scope Use
+Not intended for production use.
+See https://huggingface.co/albert-large-v2
+## Bias, Risks, and Limitations
+See https://huggingface.co/albert-large-v2
+### Recommendations
+See https://huggingface.co/albert-large-v2
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+from transformers import AlbertForSequenceClassification
+model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-sst2")
+```
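To go from loading the model to a prediction, the snippet above could be extended roughly as follows. This is a minimal sketch and not part of the committed model card. It assumes the tokenizer can be loaded from the base `albert-large-v2` checkpoint and that the classifier head follows the GLUE SST-2 label convention (0 = negative, 1 = positive). It reuses the 128-token maximum sequence length stated under Training Hyperparameters. The helper names `predict_label` and `classify` are hypothetical.

```python
def predict_label(logits):
    """Return the index of the largest logit for a single example (argmax)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def classify(sentences,
             model_id="cdhinrichs/albert-large-v2-sst2",
             tokenizer_id="albert-large-v2",  # assumption: base-model tokenizer
             max_length=128):
    """Hypothetical helper: return one predicted class index per sentence."""
    # Heavy imports stay local so predict_label works without them installed.
    import torch
    from transformers import AlbertForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AlbertForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    # Pad to the longest sentence in the batch, truncate at max_length tokens.
    batch = tokenizer(sentences, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (batch_size, num_labels)
    return [predict_label(row) for row in logits.tolist()]
```

Calling `classify(["A gorgeous, witty, seductive movie."])` downloads the weights on first use and returns a list with one class index per input sentence. Check the `id2label` mapping in the model config before interpreting the indices.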
+## Training Details
+### Training Data
+See https://huggingface.co/datasets/glue#sst2
+SST-2 is a binary sentiment classification task and part of the GLUE benchmark.
+### Training Procedure
+Adam optimization was used on the pretrained ALBERT model at
+https://huggingface.co/albert-large-v2.
+ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+https://arxiv.org/abs/1909.11942
+#### Training Hyperparameters
+Training hyperparameters (learning rate, batch size, ALBERT dropout rate,
+classifier dropout rate, warmup steps, and training steps) were taken from
+Table A.4 of
+ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+https://arxiv.org/abs/1909.11942
+Max sequence length (MSL) was set to 128, which differs from the value in that table.
+## Evaluation
+Classification accuracy is used to evaluate model performance.
+### Testing Data, Factors & Metrics
+#### Testing Data
+See https://huggingface.co/datasets/glue#sst2
+#### Metrics
+Classification accuracy
+### Results
+- Training classification accuracy: 0.9990794221146565
+- Evaluation classification accuracy: 0.9461009174311926
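Classification accuracy is the fraction of examples whose predicted label matches the reference label. A minimal sketch of that computation follows; the function name `accuracy` is hypothetical. The final line rests on an inference, not on anything stated in the model card: if evaluation used the 872-example GLUE SST-2 validation split, the reported evaluation accuracy corresponds to 825 correct predictions.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the predicted label equals the reference label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example: three of the four predictions match their labels.
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])

# Assumption (not stated in the model card): evaluation ran on the
# 872-example SST-2 validation split, so 825 correct predictions
# would reproduce the reported evaluation accuracy.
inferred_eval_accuracy = 825 / 872
```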
+## Environmental Impact
+The model was finetuned on a single-user workstation with a single GPU. CO2
+impact is expected to be minimal.