---
language: en
tags:
- text-classification
- albert
---

# Model Card for albert-base-rci-wikisql-col

# Model Details

## Model Description

More information needed

- **Developed by:** Michael Glass
- **Shared by [Optional]:** Michael Glass
- **Model type:** Text Classification
- **Language(s) (NLP):** English
- **License:** More information needed
- **Parent Model:** [ALBERT Base v2](https://huggingface.co/albert-base-v2?text=The+goal+of+life+is+%5BMASK%5D.)
- **Resources for more information:**
  - [ALBERT GitHub Repo](https://github.com/google-research/albert)
  - [ALBERT Paper](https://arxiv.org/abs/1909.11942)

# Uses

## Direct Use

This model can be used for the task of text classification.

> This model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering.

See the [ALBERT Base v2 model card](https://huggingface.co/albert-base-v2?text=The+goal+of+life+is+%5BMASK%5D.) for more information.

## Downstream Use [Optional]

More information needed.

## Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people. For tasks such as text generation, consider a model like GPT2 instead.

# Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
# Training Details

## Training Data

The parent ALBERT model was pretrained on [BookCorpus](https://yknzhu.wixsite.com/mbweb), a dataset consisting of 11,038 unpublished books, and [English Wikipedia](https://en.wikipedia.org/wiki/English_Wikipedia) (excluding lists, tables and headers).

See the [ALBERT Base v2 model card](https://huggingface.co/albert-base-v2?text=The+goal+of+life+is+%5BMASK%5D.) for more information.

## Training Procedure

### Preprocessing

> The texts are lowercased and tokenized using SentencePiece and a vocabulary size of 30,000. The inputs of the model are then of the form:

```
[CLS] Sentence A [SEP] Sentence B [SEP]
```

See the [ALBERT Base v2 model card](https://huggingface.co/albert-base-v2?text=The+goal+of+life+is+%5BMASK%5D.) for more information.

### Speeds, Sizes, Times

More information needed

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

More information needed

### Factors

More information needed

### Metrics

More information needed

## Results

More information needed

# Model Examination

More information needed

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed

# Technical Specifications [optional]

## Model Architecture and Objective

More information needed

## Compute Infrastructure

More information needed

### Hardware

More information needed

### Software

More information needed.
# Citation

**BibTeX:**

```bibtex
@article{DBLP:journals/corr/abs-1909-11942,
  author        = {Zhenzhong Lan and
                   Mingda Chen and
                   Sebastian Goodman and
                   Kevin Gimpel and
                   Piyush Sharma and
                   Radu Soricut},
  title         = {{ALBERT:} {A} Lite {BERT} for Self-supervised Learning of Language Representations},
  journal       = {CoRR},
  volume        = {abs/1909.11942},
  year          = {2019},
  url           = {http://arxiv.org/abs/1909.11942},
  archivePrefix = {arXiv},
  eprint        = {1909.11942},
  timestamp     = {Fri, 27 Sep 2019 13:04:21 +0200},
  biburl        = {https://dblp.org/rec/journals/corr/abs-1909-11942.bib},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}
```

**APA:**

More information needed

# Glossary [optional]

More information needed

# More Information [optional]

More information needed

# Model Card Authors [optional]

Michael Glass in collaboration with Ezi Ozoani and the Hugging Face team

# Model Card Contact

More information needed

# How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and sequence-classification model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("michaelrglass/albert-base-rci-wikisql-col")
model = AutoModelForSequenceClassification.from_pretrained("michaelrglass/albert-base-rci-wikisql-col")
```
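
The snippet above only loads the weights. As a rough sketch of how a prediction might be obtained, the code below passes a sequence pair to the tokenizer, which builds the `[CLS] Sentence A [SEP] Sentence B [SEP]` input described under Preprocessing, and converts the resulting logits to probabilities. The helper names (`format_pair`, `softmax`, `score_pair`), the example inputs, and the idea of pairing a question with a column header are assumptions made for illustration. This card does not document the expected input serialization or what each label means, so inspect `model.config.id2label` before relying on the output.

```python
import math


def format_pair(sentence_a: str, sentence_b: str) -> str:
    """Illustrative only: render the lowercased pair template as a plain string.

    The real tokenizer produces SentencePiece token IDs and inserts the
    special tokens itself; this string is just a readable approximation.
    """
    return f"[CLS] {sentence_a.lower()} [SEP] {sentence_b.lower()} [SEP]"


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_pair(sentence_a: str, sentence_b: str,
               model_name: str = "michaelrglass/albert-base-rci-wikisql-col"):
    """Hypothetical helper: return one probability per label for a sequence pair.

    Requires `transformers`, `torch`, and `sentencepiece`, plus network access
    on first use. Imports are lazy so the helpers above work without them.
    """
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    # Passing two texts makes the tokenizer encode them as a sequence pair.
    inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return softmax(logits[0].tolist())


if __name__ == "__main__":
    # Assumed example inputs; the intended input format is not documented here.
    print(format_pair("Which city hosted the final?", "Venue"))
    print(score_pair("Which city hosted the final?", "Venue"))
```

Calling `score_pair` downloads the model on first use and returns a list with one probability per label. Reloading the model on every call keeps the sketch short; for repeated scoring, load the tokenizer and model once and reuse them.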