michal-stefanik committed on
Commit 15ace39
1 Parent(s): 640a529

Readme examples

Files changed (1):
  1. README.md +19 -15
README.md CHANGED
@@ -5,19 +5,28 @@ language:
 - multilingual
 - cs
 - en
+widget:
+- text: "Otázka: Jaký je důvod dotazu zákazníka?\nKontext: Dobrý den, Žádáme zaslání nové smlouvy kvůli řešení pojistné události. Zašlete na tento mail nebo přímo do systému. S pozdravem Petra Hladká | disponentka servisu.\nOdpověď: řešení pojistné události\nOtázka: Jaký je důvod dotazu zákazníka?\nKontext: Dobrý den, chtěla bych Vás požádat o zaslání kopie technického průkazu z důvodu jeho ztráty. S pozdravem Milan Tvrdý.\nOdpověď:"
+  example_title: "Few-shot: Customer request (cs)"
+- text: "Otázka: Jaké schopnosti daly magické předměty Jurovi Jánošíkovi? \nKontext: Podle slovenského lidového podání byl Juro Jánošík obdařen magickými předměty (kouzelná valaška, čarovný opasek), které mu dodávaly nadpřirozené schopnosti. Okrádal především šlechtice, trestal panské dráby a ze svého lupu vyděloval část pro chudé, tedy bohatým bral a chudým dával. \nOdpověď:"
+  example_title: "Zero-shot: Question Answering (cs)"
+- text: "Question: What is the score of this review? \n Context: I did not like the plot at all. Not recommended. \n Answer: 1 \n Question: What is the score of this review? \n Context: I loved the performance. Can’t believe they did not use CGI for the finale. I think it’s my new favourite movie. \nAnswer: 5 \nQuestion: Is the score of this review 1, 2, 3, 4 or 5? \nContext: The beginning was awesome, but at the end it felt a little rushed. I enjoyed the movie, but probably won’t rewatch soon. \nAnswer:"
+  example_title: "Few-shot: Movie reviews (en)"
+- text: "Question: What is the score of this review? \n Context: I did not like the plot at all. Not recommended. \n Answer: 1 \n Question: What is the score of this review? \n Context: I loved the performance. Can’t believe they did not use CGI for the finale. I think it’s my new favourite movie. \nAnswer: 5 \nQuestion: Is the score of this review 1, 2, 3, 4 or 5? \nContext: The beginning was awesome, but at the end it felt a little rushed. I enjoyed the movie, but probably won’t rewatch soon. \nAnswer:"
+  example_title: "Few-shot: Customer request (en)"
 ---
 
-# Mt5-large for Prime Czech+English Generative Question Answering
+# Mt5-large for Few-shot Czech+English Generative Question Answering
 
-This is the [mt5-base](https://huggingface.co/google/mt5-base) model with an LM head for a generation of extractive answers,
+This is the [mt5-large](https://huggingface.co/google/mt5-large) model with an LM head for generating extractive answers,
 given a small set of 2-5 demonstrations (i.e. primes).
 
-## Priming
+## Few-shot (i.e. priming)
 
-Note that **this is a priming model** that expects a **set of demonstrations** of your task of interest,
+Note that **this is primarily a few-shot model** that expects a **set of demonstrations** of your task of interest,
 similarly to GPT-3.
 Rather than performing well on conventional question answering, it aims to learn to extrapolate the pattern of the given demonstrations
-to novel tasks, such as Named Entity Recognition or Keywords Extraction from a given pattern.
+to novel tasks, such as Named Entity Recognition or Keyword Extraction from a given pattern. However, it can also be used as a conventional QA model (see the examples below).
 
 ## Data & Training
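For illustration (not part of this commit's diff): a primed input following the Question/Context/Answer pattern of the widget examples above could be assembled as follows. The demonstration texts here are invented.

```python
# Illustration only: assembling a primed (few-shot) input in the
# Question/Context/Answer format shown by the widget examples above.
# The demonstration texts below are made up.
demonstrations = [
    ("What is the customer's request?",
     "Hello, please send me a copy of my contract. Regards, John.",
     "a copy of the contract"),
    ("What is the customer's request?",
     "Hi, I would like to change my delivery address. Thanks, Jane.",
     "change of delivery address"),
]
question = "What is the customer's request?"
context = "Good morning, could you cancel my subscription effective today?"

primed_input = ""
for q, c, a in demonstrations:
    primed_input += f"Question: {q}\nContext: {c}\nAnswer: {a}\n"
# The final query is left unanswered for the model to complete.
primed_input += f"Question: {question}\nContext: {context}\nAnswer:"
print(primed_input)
```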
 
@@ -29,10 +38,10 @@ To train the model to use the demonstrations, we've **clustered** the samples by
 in English AdversarialQA and by the category in the Czech SQAD and used the examples of the same cluster as the demonstrations
 of the task in training.
 
-We find that the specific algorithm of selection of these demonstrations makes a big difference in the model's ability to extrapolate
-to new tasks and will be shared in the following article; stay tuned!
+We find that the specific algorithm for selecting these demonstrations is crucial for the model's ability to extrapolate
+to new tasks. We'll share more details in an upcoming article; stay tuned!
 
-For the Czech SQAD 3.0, original contexts (=whole Wikipedia websites) were limited to a maximum of 8000 characters
+For the Czech SQAD 3.0, original contexts (=whole Wikipedia pages) were limited to a maximum of 4000 characters
 per sequence of prime demonstrations.
 Pre-processing script for Czech SQAD is available [here](https://huggingface.co/gaussalgo/xlm-roberta-large_extractive-QA_en-cs/blob/main/parse_czech_squad.py).
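The selection algorithm itself is unpublished (see the note in the hunk above). Purely as a schematic sketch of the idea "same-cluster examples serve as demonstrations", with hypothetical field names (`cluster`, `question`, `context`, `answer`) and the 4000-character cap applied naively per context, one could write:

```python
# Schematic sketch only -- NOT the authors' (unpublished) selection algorithm.
# Field names and the way the character cap is applied are assumptions.
from collections import defaultdict

MAX_CONTEXT_CHARS = 4000  # cap mentioned above for Czech SQAD 3.0 contexts

def build_primed_sequences(samples, n_demonstrations=3):
    # group samples by their cluster label
    by_cluster = defaultdict(list)
    for sample in samples:
        by_cluster[sample["cluster"]].append(sample)

    sequences = []
    for cluster_samples in by_cluster.values():
        for i, target in enumerate(cluster_samples):
            # demonstrations come from the same cluster, excluding the target itself
            demos = [s for j, s in enumerate(cluster_samples) if j != i][:n_demonstrations]
            if not demos:
                continue
            text = ""
            for d in demos:
                text += (f"Question: {d['question']}\n"
                         f"Context: {d['context'][:MAX_CONTEXT_CHARS]}\n"
                         f"Answer: {d['answer']}\n")
            text += (f"Question: {target['question']}\n"
                     f"Context: {target['context'][:MAX_CONTEXT_CHARS]}\n"
                     f"Answer:")
            sequences.append({"input_text": text, "label": target["answer"]})
    return sequences
```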
 
@@ -88,11 +97,6 @@ input_text = """
 Context: Customer id: Barrack Obama, if not deliverable, return to Bill Clinton.
 Answer:"""
 ```
-
-Note that despite its size, English AdversarialQA has a variety of reported biases,
-conditioned by the relative position or type of the answer in the context that can affect the model's performance on new data
-(see, e.g. [L. Mikula (2022)](https://is.muni.cz/th/adh58/?lang=en), Chap. 4.1).
-
 ## Usage
 
 Here is how to use this model to answer the question on a given context using 🤗 Transformers in PyTorch:
@@ -100,8 +104,8 @@ Here is how to use this model to answer the question on a given context using 
 ```python
 from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
 
-tokenizer = AutoTokenizer.from_pretrained("gaussalgo/mt5-base-priming-QA_en-cs")
-model = AutoModelForSeq2SeqLM.from_pretrained("gaussalgo/mt5-base-priming-QA_en-cs")
+tokenizer = AutoTokenizer.from_pretrained("gaussalgo/mt5-large-priming-QA_en-cs")
+model = AutoModelForSeq2SeqLM.from_pretrained("gaussalgo/mt5-large-priming-QA_en-cs")
 
 # For the expected format of input_text, see Intended use above
 inputs = tokenizer(input_text, return_tensors="pt")
 
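The Usage snippet in this excerpt stops at the tokenization step. A minimal continuation using standard 🤗 Transformers calls (not shown in this diff) would generate and decode the answer:

```python
# Continuation of the Usage snippet above (standard generate/decode calls,
# not part of this diff excerpt).
outputs = model.generate(**inputs)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(answer)
```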