batterydata/bert-base-uncased-abstract

batterydata committed on Mar 5, 2022

Commit 383638f • 1 Parent(s): b6c9b2e

Update README.md

Files changed (1)

README.md +49 -0

@@ -1,3 +1,52 @@
 ---
+language: en
+tags: Text Classification
 license: apache-2.0
+datasets:
+- batterydata/paper-abstracts
+metrics: glue
 ---
+# BERT-base-uncased for Battery Abstract Classification
+**Language model:** bert-base-uncased
+**Language:** English
+**Downstream-task:** Text Classification
+**Training data:** training\_data.csv
+**Eval data:** val\_data.csv
+**Code:**  See [example](https://github.com/ShuHuang/batterybert)
+**Infrastructure**: 8x DGX A100
+## Hyperparameters
+```
+batch_size = 32
+n_epochs = 13
+base_LM_model = "bert-base-uncased"
+learning_rate = 2e-5
+```
+## Performance
+```
+"Validation accuracy": 96.79,
+"Test accuracy": 96.29,
+```
+## Usage
+### In Transformers
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
+model_name = "batterydata/bert-base-uncased-abstract"
+# a) Get predictions
+nlp = pipeline('text-classification', model=model_name, tokenizer=model_name)
+# Pass a plain string (or a list of strings); a set literal is not a valid pipeline input
+text = 'The typical non-aqueous electrolyte for commercial Li-ion cells is a solution of LiPF6 in linear and cyclic carbonates.'
+res = nlp(text)
+# b) Load model & tokenizer
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+```
+## Authors
+Shu Huang: `sh2009 [at] cam.ac.uk`
+Jacqueline Cole: `jmc61 [at] cam.ac.uk`
+## Citation
+BatteryBERT: A Pre-trained Language Model for Battery Database Enhancement
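
Note on the usage snippet: step b) in the README loads the model and tokenizer but stops before running inference. The following is a minimal sketch of how those two objects could be used to classify one abstract by hand, assuming PyTorch is installed. The helper names `softmax`, `top_label`, and `classify` are illustrative and not part of the repository, and the label names the model returns depend on its `config.id2label`, which this commit does not document.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text, model_name="batterydata/bert-base-uncased-abstract"):
    """Run one abstract through the model, loaded as in step b) of the README."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Truncate to the model's maximum length; abstracts can exceed 512 tokens.
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # id2label comes from the model config; its contents are not stated in this README.
    return top_label(logits, model.config.id2label)
```

Calling `classify("...abstract text...")` downloads the checkpoint on first use and returns a `(label, probability)` pair. The `pipeline` call in step a) performs the same tokenization, forward pass, and softmax internally, so the manual route is mainly useful when the raw logits are needed.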