ArvinZhuang committed
Commit: d7aa233
1 Parent(s): 28d30a6

Update README.md

Files changed (1)
  1. README.md +79 -0
README.md CHANGED

---
license: apache-2.0
library_name: transformers
pipeline_tag: text2text-generation

inference:
  parameters:
    do_sample: true
    max_length: 64
    top_k: 10
    temperature: 1
    num_return_sequences: 10
widget:
  - text: >-
      Generate a Japanese question for this passage: Transformer (machine learning model) A transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input (which includes the recursive output) data.
    example_title: Generate Japanese questions

  - text: >-
      Generate a Arabic question for this passage: Transformer (machine learning model) A transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input (which includes the recursive output) data.
    example_title: Generate Arabic questions
---

## Model description

An mT5-large query generation model trained on XOR QA data. Given a passage, it generates questions in the target language named in the prompt (Arabic, Bengali, Finnish, Japanese, Korean, Russian, or Telugu).

Used in the papers [Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation](https://arxiv.org/pdf/2206.10128.pdf) and [Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval]().

### How to use
```python
from transformers import pipeline

# Map language codes to the language names used in the prompt template.
lang2mT5 = dict(
    ar='Arabic',
    bn='Bengali',
    fi='Finnish',
    ja='Japanese',
    ko='Korean',
    ru='Russian',
    te='Telugu'
)
PROMPT = 'Generate a {lang} question for this passage: {title} {passage}'

title = 'Transformer (machine learning model)'
passage = 'A transformer is a deep learning model that adopts the mechanism of self-attention, differentially ' \
          'weighting the significance of each part of the input (which includes the recursive output) data.'

# Fill the prompt with the target language, passage title, and passage text.
model_name_or_path = 'ielabgroup/xor-tydi-docTquery-mt5-base'
input_text = PROMPT.format_map({'lang': lang2mT5['ja'],
                                'title': title,
                                'passage': passage})

generator = pipeline(model=model_name_or_path,
                     task='text2text-generation',
                     device="cuda:0",
                     )

# Sample 10 candidate questions for the passage.
results = generator(input_text,
                    do_sample=True,
                    max_length=64,
                    num_return_sequences=10,
                    )

for i, result in enumerate(results):
    print(f'{i + 1}. {result["generated_text"]}')
```
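
For finer control over batching, device placement, or decoding, the checkpoint can also be loaded directly with `AutoTokenizer` and `AutoModelForSeq2SeqLM`. The snippet below is a minimal sketch, assuming the same checkpoint and prompt format as the pipeline example above; the sampling settings mirror the widget defaults (`do_sample=True`, `top_k=10`, `max_length=64`, `num_return_sequences=10`).

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative sketch: same checkpoint as the pipeline example above.
model_name_or_path = 'ielabgroup/xor-tydi-docTquery-mt5-base'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
model.eval()

# Same prompt format as above: target language, passage title, passage text.
input_text = ('Generate a Japanese question for this passage: '
              'Transformer (machine learning model) A transformer is a deep learning model '
              'that adopts the mechanism of self-attention, differentially weighting the '
              'significance of each part of the input (which includes the recursive output) data.')

inputs = tokenizer(input_text, return_tensors='pt')
with torch.no_grad():
    output_ids = model.generate(**inputs,
                                do_sample=True,
                                top_k=10,
                                max_length=64,
                                num_return_sequences=10)

for i, ids in enumerate(output_ids):
    print(f'{i + 1}. {tokenizer.decode(ids, skip_special_tokens=True)}')
```

Swapping the language name in the prompt (for example `Arabic` instead of `Japanese`) switches the language of the generated questions.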

### BibTeX entry and citation info

```bibtex
@article{zhuang2022bridging,
  title={Bridging the gap between indexing and retrieval for differentiable search index with query generation},
  author={Zhuang, Shengyao and Ren, Houxing and Shou, Linjun and Pei, Jian and Gong, Ming and Zuccon, Guido and Jiang, Daxin},
  journal={arXiv preprint arXiv:2206.10128},
  year={2022}
}
```