BatterySciBERT-cased for QA

Language model: batteryscibert-cased
Language: English
Downstream-task: Extractive QA
Training data: SQuAD v1
Eval data: SQuAD v1
Code: See example
Infrastructure: 8x DGX A100

Hyperparameters

batch_size = 32
n_epochs = 3
base_LM_model = "batteryscibert-cased"
max_seq_len = 386
learning_rate = 2e-5
doc_stride = 128
max_query_length = 64
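
To illustrate how max_seq_len, doc_stride and max_query_length interact, the sketch below splits a long context into overlapping windows. It assumes the convention of the original BERT SQuAD script, where doc_stride is the step between window starts and three positions are reserved for special tokens; some toolkits instead treat the stride as the overlap size, and the card does not say which one was used. The function name split_into_windows is hypothetical.

```python
def split_into_windows(n_doc_tokens, n_query_tokens,
                       max_seq_len=386, doc_stride=128, max_query_length=64):
    """Return (start, length) pairs of context windows, in token units.

    Assumption: doc_stride is the step between window starts and
    3 positions are reserved for [CLS], [SEP], [SEP].
    """
    # The question is truncated to max_query_length tokens.
    n_query = min(n_query_tokens, max_query_length)
    # Room left for context tokens in each window.
    max_doc = max_seq_len - n_query - 3
    if max_doc <= 0:
        raise ValueError("max_seq_len is too small for the question")

    windows = []
    start = 0
    while start < n_doc_tokens:
        length = min(n_doc_tokens - start, max_doc)
        windows.append((start, length))
        if start + length == n_doc_tokens:
            break  # the last window reaches the end of the context
        start += min(length, doc_stride)
    return windows


if __name__ == "__main__":
    # A 1000-token context with a 10-token question.
    for start, length in split_into_windows(1000, 10):
        print(start, start + length)
```

A context that fits into one window is returned as a single (0, n_doc_tokens) pair, so short passages are not split at all.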

Performance

Evaluated on the SQuAD v1.0 dev set.

"exact": 79.66,
"f1": 87.43,
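
For readers unfamiliar with the two SQuAD metrics, here is a minimal sketch of how "exact" and "f1" are usually computed for a single prediction. It assumes the normalization used by the standard SQuAD v1.1 evaluation script (lowercasing, stripping punctuation and English articles); the card does not state which script produced the numbers above. Dataset-level scores are the per-question values, taken as the maximum over the reference answers, averaged and scaled to 0-100.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # Multiset intersection counts each shared token at most min(count) times.
    n_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if n_same == 0:
        return 0.0
    precision = n_same / len(pred_tokens)
    recall = n_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The LiPF6.", "LiPF6"))
    print(f1_score("a solution of LiPF6", "LiPF6 in carbonates"))
```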

Evaluated on the battery device dataset.

"precision": 65.09,
"recall": 84.56,

Usage

In Transformers

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "batterydata/batteryscibert-cased-squad-v1"
# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'What is the electrolyte?',
    'context': 'The typical non-aqueous electrolyte for commercial Li-ion cells is a solution of LiPF6 in linear and cyclic carbonates.'
}
res = nlp(QA_input)
# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
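
When the model and tokenizer are loaded directly as in b), the model returns start and end logits for each token rather than an answer string, so the caller has to choose the span. The sketch below shows one simple way to do that in plain Python. The helper best_span and its max_answer_len default are hypothetical, and the question-answering pipeline in a) applies further steps (for example masking question tokens) that are omitted here.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick the (start, end) token pair with the highest summed logit.

    Only spans with start <= end and at most max_answer_len tokens
    are considered. Returns (start, end, score).
    """
    best_start, best_end = 0, 0
    best_score = float("-inf")
    for start, start_logit in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logit + end_logits[end]
            if score > best_score:
                best_start, best_end, best_score = start, end, score
    return best_start, best_end, best_score


if __name__ == "__main__":
    # Toy logits for a 4-token input; real values come from the model output.
    start_logits = [0.1, 2.0, 0.3, 0.0]
    end_logits = [0.0, 0.5, 3.0, 0.2]
    print(best_span(start_logits, end_logits))
```

With real model outputs, the logits would be read from the start_logits and end_logits fields of the model output, and the selected token indices would be mapped back to text with the tokenizer.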

Authors

Shu Huang: sh2009 [at] cam.ac.uk

Jacqueline Cole: jmc61 [at] cam.ac.uk

Citation

BatteryBERT: A Pre-trained Language Model for Battery Database Enhancement
