Edit model card

flan-t5-xl for Extractive QA

This is the flan-t5-xl model, fine-tuned using the SQuAD2.0 dataset. It's been trained on question-answer pairs, including unanswerable questions, for the task of Extractive Question Answering.

Overview

Language model: flan-t5-xl
Language: English
Downstream-task: Extractive QA
Training data: SQuAD 2.0
Eval data: SQuAD 2.0
Code: See an example QA pipeline on Haystack

Hyperparameters

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 4.0

Usage

In Haystack

Haystack is an NLP framework by deepset. You can use this model in a Haystack pipeline to do extractive question answering at scale (over many documents). To load the model in Haystack:

# NOTE: This only works with Haystack v2.0!
reader = ExtractiveReader("deepset/flan-t5-xl-squad2")

In Transformers

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/flan-t5-xl-squad2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

Authors

Sebastian Husch Lee: sebastian.huschlee [at] deepset.ai

About us

deepset is the company behind the open-source NLP framework Haystack which is designed to help you build production ready NLP systems that use: Question answering, summarization, ranking etc.

Get in touch and join the Haystack community

For more info on Haystack, visit our GitHub repo and Documentation.

We also have a Discord community open to everyone!

Twitter | LinkedIn | Discord | GitHub Discussions | Website

Downloads last month
69
Safetensors
Model size
2.78B params
Tensor type
F32
·
Inference API
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for deepset/flan-t5-xl-squad2

Base model

google/flan-t5-xl
Finetuned
this model

Datasets used to train deepset/flan-t5-xl-squad2

Space using deepset/flan-t5-xl-squad2 1

Evaluation results