---
license: apache-2.0
library_name: transformers
pipeline_tag: text2text-generation
inference:
  parameters:
    do_sample: true
    max_length: 64
    top_k: 10
    temperature: 1
    num_return_sequences: 10
widget:
- text: >-
    Generate a Japanese question for this passage: Transformer (machine
    learning model) A transformer is a deep learning model that adopts the
    mechanism of self-attention, differentially weighting the significance of
    each part of the input (which includes the recursive output) data.
  example_title: Generate Japanese questions
- text: >-
    Generate a Arabic question for this passage: Transformer (machine learning
    model) A transformer is a deep learning model that adopts the mechanism of
    self-attention, differentially weighting the significance of each part of
    the input (which includes the recursive output) data.
  example_title: Generate Arabic questions
---
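
For reference, the `inference.parameters` block above holds the settings the hosted widget forwards to generation; the same settings written as plain Python keyword arguments look like this (a sketch for local use, not a Hub-specific API):

```python
# Generation settings mirroring the front matter's inference.parameters.
generation_kwargs = dict(
    do_sample=True,           # sample rather than greedy-decode
    max_length=64,            # cap on generated question length (in tokens)
    top_k=10,                 # sample only from the 10 most likely tokens
    temperature=1.0,          # leave the output distribution unscaled
    num_return_sequences=10,  # produce 10 candidate questions per passage
)
```

These can be passed to a `text2text-generation` pipeline call as `generator(input_text, **generation_kwargs)`.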

## Model description

An mT5-large query generation model trained with XOR QA data.

Used in the papers [Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation](https://arxiv.org/pdf/2206.10128.pdf) and [Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval]().

### How to use

```python
from transformers import pipeline

# Language codes supported by the model, mapped to the names used in the prompt.
lang2mT5 = dict(
    ar='Arabic',
    bn='Bengali',
    fi='Finnish',
    ja='Japanese',
    ko='Korean',
    ru='Russian',
    te='Telugu',
)
PROMPT = 'Generate a {lang} question for this passage: {title} {passage}'

title = 'Transformer (machine learning model)'
passage = 'A transformer is a deep learning model that adopts the mechanism of self-attention, differentially ' \
          'weighting the significance of each part of the input (which includes the recursive output) data.'

model_name_or_path = 'ielabgroup/xor-tydi-docTquery-mt5-base'

# Fill the prompt template for the target language (Japanese here).
input_text = PROMPT.format_map({'lang': lang2mT5['ja'],
                                'title': title,
                                'passage': passage})

generator = pipeline(model=model_name_or_path,
                     task='text2text-generation',
                     device='cuda:0',
                     )

# Sample 10 candidate questions for the passage.
results = generator(input_text,
                    do_sample=True,
                    max_length=64,
                    num_return_sequences=10,
                    )

for i, result in enumerate(results):
    print(f'{i + 1}. {result["generated_text"]}')
```

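The prompt template is language-agnostic, so the same passage can be turned into an input for every supported language. A minimal sketch (runs without downloading the model) that builds one prompt per language:

```python
# Build a query-generation prompt for each language the model supports.
lang2mT5 = dict(
    ar='Arabic', bn='Bengali', fi='Finnish', ja='Japanese',
    ko='Korean', ru='Russian', te='Telugu',
)
PROMPT = 'Generate a {lang} question for this passage: {title} {passage}'

title = 'Transformer (machine learning model)'
passage = ('A transformer is a deep learning model that adopts the '
           'mechanism of self-attention.')

inputs = {code: PROMPT.format_map({'lang': name,
                                   'title': title,
                                   'passage': passage})
          for code, name in lang2mT5.items()}

print(inputs['ja'])
```

Each of the resulting strings can then be fed to the same `generator` call shown above.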
### BibTeX entry and citation info

```bibtex
@article{zhuang2022bridging,
  title={Bridging the gap between indexing and retrieval for differentiable search index with query generation},
  author={Zhuang, Shengyao and Ren, Houxing and Shou, Linjun and Pei, Jian and Gong, Ming and Zuccon, Guido and Jiang, Daxin},
  journal={arXiv preprint arXiv:2206.10128},
  year={2022}
}
```