metadata
language:
- en
- es
- eu
datasets:
- squad
Description
This is a basic implementation of the multilingual model "ixambert-base-cased", fine-tuned on SQuAD version 1.1, that is able to answer basic factual questions in English, Spanish and Basque.
Outputs
The model predicts a span of text from the context and a score for the probability for that span to be the correct answer. For example:
{'score': 0.9667195081710815, 'start': 101, 'end': 105, 'answer': '1820'}
How to use
The model can be used directly with a question-answering pipeline:
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
model_name = "MarcBrun/ixambert-finetuned-squad"
# To get predictions
context = "Florence Nightingale, known for being the founder of modern nursing, was born in Florence, Italy, in 1820"
question = "When was Florence Nightingale born?"
qa = pipeline("question-answering", model=model_name, tokenizer=model_name)
pred = qa(question=question,context=context)
# To load the model and tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
Hyperparameters
batch_size = 8
n_epochs = 3
base_LM_model = "ixambert-base-cased"
learning_rate = 2e-5
optimizer = AdamW
lr_schedule = linear
max_seq_len = 384
doc_stride = 128