longformer-large-4096 fine-tuned on SQuAD2.0 for answerability scoring

This model estimates whether a question is answerable given the context. The output is a probability: values close to 0.0 indicate that the question is unanswerable, and values close to 1.0 indicate that it is answerable.

Input: question and context
Output: probability (i.e. logit -> sigmoid)
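
The logit -> sigmoid step on the Output line is the standard logistic function. As a minimal sketch in plain Python, independent of the model itself (the function name `logit_to_prob` is hypothetical and used only for illustration):

```python
import math

def logit_to_prob(logit: float) -> float:
    """Map a raw logit to a probability with the logistic sigmoid."""
    # Branch on the sign so that math.exp() never receives a large positive
    # argument, which would overflow.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)

print(logit_to_prob(0.0))  # 0.5
```

A logit of 0 maps to 0.5; positive logits lean towards answerable and negative logits towards unanswerable.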

Model Details

The longformer-large-4096 model is fine-tuned on the SQuAD2.0 dataset, where the input is the concatenation of question + context, joined by the tokenizer's separator token. Because the answerable and unanswerable classes in SQuAD2.0 are imbalanced, we resample so that the model is trained on a 50/50 split between answerable and unanswerable samples.
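
The card does not say whether the 50/50 split was obtained by down-sampling the larger class or up-sampling the smaller one. The sketch below shows one possible approach, assuming down-sampling. The function name `balance_by_downsampling` and the example schema (a dict with a boolean `"answerable"` key) are hypothetical and are not the SQuAD2.0 file format.

```python
import random

def balance_by_downsampling(examples, seed=0):
    """Return a shuffled subset with equal numbers of both classes.

    Assumption: each example is a dict with a boolean "answerable" key.
    """
    rng = random.Random(seed)
    pos = [ex for ex in examples if ex["answerable"]]
    neg = [ex for ex in examples if not ex["answerable"]]
    n = min(len(pos), len(neg))  # size of the smaller class
    balanced = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(balanced)
    return balanced

# Toy data: 20 answerable and 10 unanswerable examples.
toy = [{"id": i, "answerable": i % 3 != 0} for i in range(30)]
balanced = balance_by_downsampling(toy)
print(len(balanced), sum(ex["answerable"] for ex in balanced))  # 20 10
```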

How to Use the Model

Use the code below to get started with the model.

>>> import torch
>>> from transformers import LongformerTokenizer, LongformerForSequenceClassification

>>> tokenizer = LongformerTokenizer.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")
>>> model = LongformerForSequenceClassification.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")

>>> context = """
British government ministers have been banned from using Chinese-owned social media app TikTok on their work phones and devices on security grounds.
The government fears sensitive data held on official phones could be accessed by the Chinese government.
Cabinet Minister Oliver Dowden said the ban was a "precautionary" move but would come into effect immediately.
""".replace("\n", " ").strip()

>>> question1   = "Which application have been banned by the British government?"
>>> input_text1 = question1 + ' ' + tokenizer.sep_token + ' ' + context
>>> inputs1     = tokenizer(input_text1, max_length=4096, truncation=True, return_tensors="pt")
>>> prob1 = torch.sigmoid(model(**inputs1).logits.squeeze(-1))
>>> print("P(answerable|question1, context) = {:.2f}%".format(prob1.item()*100))
P(answerable|question1, context) = 99.21% # highly answerable

>>> question2   = "Is Facebook popular among young students in America?"
>>> input_text2 = question2 + ' ' + tokenizer.sep_token + ' ' + context
>>> inputs2     = tokenizer(input_text2, max_length=4096, truncation=True, return_tensors="pt")
>>> prob2 = torch.sigmoid(model(**inputs2).logits.squeeze(-1))
>>> print("P(answerable|question2, context) = {:.2f}%".format(prob2.item()*100))
P(answerable|question2, context) = 2.53% # highly unanswerable
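
If a hard answerable/unanswerable decision is needed, the probability can be thresholded. The card does not specify a threshold, so the 0.5 default below is an assumption, and `label_answerable` is a hypothetical helper. It is applied here to the two probabilities printed above.

```python
def label_answerable(prob: float, threshold: float = 0.5) -> str:
    """Turn an answerability probability into a hard label.

    Assumption: 0.5 is an arbitrary default; tune it on held-out data.
    """
    return "answerable" if prob >= threshold else "unanswerable"

for p in (0.9921, 0.0253):
    print(p, label_answerable(p))
```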

Citation

@misc{manakul2023selfcheckgpt,
      title={SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models}, 
      author={Potsawee Manakul and Adian Liusie and Mark J. F. Gales},
      year={2023},
      eprint={2303.08896},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
