---
license: apache-2.0
datasets:
- squad_v2
language:
- en
library_name: transformers
pipeline_tag: text-classification
inference: false
---
# longformer-large-4096 fine-tuned on SQuAD2.0 for answerability scoring
This model predicts whether a question is answerable given the context.
The output is a probability: values close to 0.0 indicate that the question is unanswerable, and values close to 1.0 indicate that it is answerable.

- Input: `question` and `context`
- Output: `probability` (i.e. the model's raw logit passed through a sigmoid)

## Model Details

The longformer-large-4096 model is fine-tuned on the SQuAD2.0 dataset, where the input is a concatenation of ```question + context```. 
Because SQuAD2.0 is class-imbalanced (answerable questions outnumber unanswerable ones in the training set), we resample so that the model is trained on a 50/50 split between answerable and unanswerable samples.
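
The exact preprocessing script is not released with this card; the snippet below is a minimal sketch of one way to build such a balanced training set with the 🤗 `datasets` library. Downsampling the majority class (rather than oversampling) and the shuffle seed are assumptions made here for illustration.

```python
from datasets import load_dataset, concatenate_datasets

squad = load_dataset("squad_v2", split="train")

# Binary label: answerable questions have at least one gold answer span,
# unanswerable ones have an empty answer list
answerable   = squad.filter(lambda ex: len(ex["answers"]["text"]) > 0)
unanswerable = squad.filter(lambda ex: len(ex["answers"]["text"]) == 0)

# Downsample the larger class so both classes contribute equally (assumption:
# the authors may instead have oversampled the minority class)
n = min(len(answerable), len(unanswerable))
balanced = concatenate_datasets([
    answerable.shuffle(seed=42).select(range(n)),
    unanswerable.shuffle(seed=42).select(range(n)),
]).shuffle(seed=42)  # 2 * n examples, 50/50 answerable/unanswerable
```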

## How to Use the Model

Use the code below to get started with the model. 

```python
>>> import torch
>>> from transformers import LongformerTokenizer, LongformerForSequenceClassification

>>> tokenizer = LongformerTokenizer.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")
>>> model = LongformerForSequenceClassification.from_pretrained("potsawee/longformer-large-4096-answerable-squad2")

>>> context = """
British government ministers have been banned from using Chinese-owned social media app TikTok on their work phones and devices on security grounds.
The government fears sensitive data held on official phones could be accessed by the Chinese government.
Cabinet Minister Oliver Dowden said the ban was a "precautionary" move but would come into effect immediately.
""".replace("\n", " ").strip()

>>> question1   = "Which applications have been banned by the British government?"
>>> input_text1 = question1 + ' ' + tokenizer.sep_token + ' ' + context
>>> inputs1     = tokenizer(input_text1, max_length=4096, truncation=True, return_tensors="pt")
>>> prob1 = torch.sigmoid(model(**inputs1).logits.squeeze(-1))
>>> print("P(answerable|question1, context) = {:.2f}%".format(prob1.item()*100))
P(answerable|question1, context) = 99.21% # highly answerable

>>> question2   = "Is Facebook popular among young students in America?"
>>> input_text2 = question2 + ' ' + tokenizer.sep_token + ' ' + context
>>> inputs2     = tokenizer(input_text2, max_length=4096, truncation=True, return_tensors="pt")
>>> prob2 = torch.sigmoid(model(**inputs2).logits.squeeze(-1))
>>> print("P(answerable|question2, context) = {:.2f}%".format(prob2.item()*100))
P(answerable|question2, context) = 2.53% # highly unanswerable
```
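
To score several questions against the same context in one forward pass, the pairs can be batched. The helper below is a hypothetical wrapper written for this card (the function name `answerability_scores` is not part of the repository); it reuses the `tokenizer` and `model` loaded above.

```python
from typing import List

import torch

def answerability_scores(questions: List[str], context: str) -> List[float]:
    # Build "question </s> context" strings, matching the format used above
    texts = [q + ' ' + tokenizer.sep_token + ' ' + context for q in questions]
    inputs = tokenizer(texts, max_length=4096, padding=True,
                       truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only; no gradients needed
        logits = model(**inputs).logits.squeeze(-1)
    return torch.sigmoid(logits).tolist()

scores = answerability_scores([question1, question2], context)
# e.g. [~0.99, ~0.03], consistent with the per-question calls above
```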

## Citation

```bibtex
@misc{manakul2023selfcheckgpt,
      title={SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models}, 
      author={Potsawee Manakul and Adian Liusie and Mark J. F. Gales},
      year={2023},
      eprint={2303.08896},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```