sjrhuschlee committed on
Commit
a0a2cc0
1 Parent(s): 71e9c82

Update README.md

Files changed (1)
  1. README.md +80 -0
README.md CHANGED
@@ -1,3 +1,83 @@
  ---
  license: mit
+ datasets:
+ - squad_v2
+ - squad
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: question-answering
+ tags:
+ - question-answering
+ - squad
+ - squad_v2
+ - t5
  ---
+
+ # flan-t5-base for Extractive QA
+
+ This is the [flan-t5-base](https://huggingface.co/google/flan-t5-base) model, fine-tuned on the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) dataset for Extractive Question Answering. The training data consists of question-answer pairs, including unanswerable questions.
+
+ **NOTE:** The `<cls>` token must be manually added to the beginning of the question for this model to work properly.
+ The model uses the `<cls>` token to make "no answer" predictions.
+ The T5 tokenizer does not add this special token automatically, which is why it must be added manually.
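The manual step described in the note can be wrapped in a small helper so the token is never added twice. This is a minimal sketch: `with_cls` is a hypothetical name, not part of `transformers`, and the default `"<cls>"` string is taken from the note above. In practice, passing `tokenizer.cls_token` is safer than relying on the default.

```python
def with_cls(question: str, cls_token: str = "<cls>") -> str:
    """Prepend the cls token to a question unless it is already there.

    Hypothetical helper for illustration; cls_token would normally be
    tokenizer.cls_token from the loaded tokenizer.
    """
    if question.startswith(cls_token):
        return question
    return f"{cls_token}{question}"
```

For example, `with_cls("Where do I live?")` builds the question string that the model expects, and calling it again on the result leaves the string unchanged.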
+
+ ## Overview
+ **Language model:** flan-t5-base
+ **Language:** English
+ **Downstream task:** Extractive QA
+ **Training data:** SQuAD 2.0
+ **Eval data:** SQuAD 2.0
+ **Infrastructure:** 1x NVIDIA 3070
+
+ ## Model Usage
+ ```python
+ import torch
+ from transformers import (
+     AutoModelForQuestionAnswering,
+     AutoTokenizer,
+     pipeline,
+ )
+ model_name = "sjrhuschlee/flan-t5-base-squad2"
+
+ # a) Using pipelines
+ nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
+ qa_input = {
+     'question': f'{nlp.tokenizer.cls_token}Where do I live?',  # '<cls>Where do I live?'
+     'context': 'My name is Sarah and I live in London'
+ }
+ res = nlp(qa_input)
+ # {'score': 0.984, 'start': 30, 'end': 37, 'answer': ' London'}
+
+ # b) Load model & tokenizer
+ model = AutoModelForQuestionAnswering.from_pretrained(model_name)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ question = f'{tokenizer.cls_token}Where do I live?'  # '<cls>Where do I live?'
+ context = 'My name is Sarah and I live in London'
+ encoding = tokenizer(question, context, return_tensors="pt")
+ outputs = model(
+     encoding["input_ids"],
+     attention_mask=encoding["attention_mask"],
+ )
+ # Read the logits by name; a positional tuple may carry extra entries for encoder-decoder models
+ start_scores, end_scores = outputs.start_logits, outputs.end_logits
+
+ # Map the highest-scoring start/end positions back to tokens, then decode the span
+ all_tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0].tolist())
+ answer_tokens = all_tokens[torch.argmax(start_scores):torch.argmax(end_scores) + 1]
+ answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
+ # 'London'
+ ```
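Taking the two argmax positions independently, as in option b), can yield an end index that precedes the start index, and it does not say how a "no answer" prediction is recognized. The sketch below shows one common way to handle both. It assumes that the `<cls>` token sits at index 0 and that a span starting and ending there stands for "no answer"; this is the usual SQuAD 2.0 convention, but the model card does not confirm it for this model. `best_span` and `max_answer_len` are hypothetical names, and the scores are plain Python lists (for example `start_scores[0].tolist()`).

```python
from typing import List, Optional, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 15,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None for "no answer".

    Assumption: index 0 holds the <cls> token, and a span that starts and
    ends there stands for "no answer".
    """
    null_score = start_scores[0] + end_scores[0]
    best_score = float("-inf")
    best: Optional[Tuple[int, int]] = None
    # Only consider spans with start <= end and at most max_answer_len tokens.
    for start in range(1, len(start_scores)):
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_scores[start] + end_scores[end]
            if score > best_score:
                best_score, best = score, (start, end)
    # Prefer "no answer" when the <cls> position scores at least as high as every real span.
    if best is None or null_score >= best_score:
        return None
    return best
```

When the function returns a pair, the answer tokens are `all_tokens[start:end + 1]`; when it returns `None`, the question is treated as unanswerable from the given context.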
+
+ ## Metrics
+
+ ```bash
+ # Squad v2
+
+ # Squad
+ ```
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training: