model documentation

#1
by nazneen - opened
Files changed (1)
  1. README.md +200 -0
README.md ADDED
@@ -0,0 +1,200 @@
+ ---
+ tags:
+ - text2text-generation
+ - t5
+ ---
+
+ # Model Card for qcpg-sentences
+
+ # Model Details
+
+ ## Model Description
+ The model creators note in the [associated paper](https://arxiv.org/pdf/2203.10940.pdf):
+ > Here we propose QCPG, a quality-guided controlled paraphrase generation model, that allows directly controlling the quality dimensions. Furthermore, we suggest a method that given a sentence, identifies points in the quality control space that are expected to yield optimal generated paraphrases. We show that our method is able to generate paraphrases which maintain the original meaning while achieving higher diversity than the uncontrolled baseline.
+
+ - **Developed by:** IBM
+ - **Shared by [Optional]:** IBM
+ - **Model type:** Text2Text Generation
+ - **Language(s) (NLP):** More information needed
+ - **License:** More information needed
+ - **Parent Model:** [All T5 Checkpoints](https://huggingface.co/models?search=t5)
+ - **Resources for more information:**
+   - [GitHub Repo](https://github.com/IBM/quality-controlled-paraphrase-generation)
+   - [Associated Paper](https://arxiv.org/pdf/2203.10940.pdf)
+
+ # Uses
+
+ ## Direct Use
+ This model can be used for the task of Text2Text generation.
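+
+ As a minimal, illustrative sketch (not the authors' official usage), the checkpoint can be loaded through the `text2text-generation` pipeline. Note that QCPG conditions on quality control values encoded in the input text; the plain-sentence prompt below is only a placeholder, and the exact input format is documented in the GitHub repo linked above.
+
+ ```python
+ from transformers import pipeline
+
+ # Illustrative only: QCPG expects quality-control values in the input text;
+ # see the GitHub repo for the exact prompt format.
+ paraphraser = pipeline("text2text-generation", model="ibm/qcpg-sentences")
+
+ print(paraphraser("The weather today is really pleasant.", max_length=64))
+ ```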
+
+ ## Downstream Use [Optional]
+
+ More information needed.
+
+ ## Out-of-Scope Use
+
+ The model should not be used to intentionally create hostile or alienating environments for people.
+
+ # Bias, Risks, and Limitations
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
+
+ ## Recommendations
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ # Training Details
+
+ ## Training Data
+
+ The model creators note in the [associated paper](https://arxiv.org/pdf/2203.10940.pdf):
+ > These datasets are large but noisy, and contain only a relatively small amount of high quality paraphrases.
+
+ *MSCOCO:* This dataset consists of 123K images, where each image contains at most five human-labeled captions (Lin et al., 2014). Similar to previous works, different captions of the same image are considered paraphrases.
+
+ *WikiAnswers (WikiAns for short):* The WikiAnswers corpus contains clusters of questions tagged by wiki-answers.com users as similar. There are 30,370,994 clusters with 25 questions in each on average. In total, the corpus contains over 70 million question pairs.
+
+ *ParaBank2.0:* A dataset containing clusters of sentential paraphrases, produced from a bilingual corpus using negative constraints, inference sampling, and clustering. The dataset is composed of an average of 5 paraphrases in every cluster and close to 100 million pairs in total.
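+
+ As an illustration of how paraphrase pairs can be derived from such clustered data (e.g., different captions of the same MSCOCO image), here is a minimal sketch; the data layout is a hypothetical placeholder, not the authors' preprocessing code:
+
+ ```python
+ from itertools import combinations
+
+ # Hypothetical layout: each cluster holds the captions describing one image.
+ clusters = {
+     "img_001": ["A dog runs on the beach.", "A dog is running along the shore."],
+     "img_002": ["Two people ride bicycles.", "A pair of cyclists on a road."],
+ }
+
+ # Treat every unordered pair of captions within a cluster as a paraphrase pair.
+ pairs = [
+     (a, b)
+     for captions in clusters.values()
+     for a, b in combinations(captions, 2)
+ ]
+ print(len(pairs), pairs[0])
+ ```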
+
+ ## Training Procedure
+
+ ### Preprocessing
+
+ The model creators note in the [associated paper](https://arxiv.org/pdf/2203.10940.pdf):
+ > To get comparable results across all datasets, we randomly sub-sampled ParaBank2.0 and WikiAns to the same size as MSCOCO, and split them to train, dev and test sets, of sizes 900K, 14K and 14K respectively. We carefully made sure that there are no pairs from the same cluster in different splits of the data.
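+
+ A minimal sketch of this kind of cluster-disjoint splitting (split fractions, seed, and data layout are placeholders, not the authors' code):
+
+ ```python
+ import random
+
+ def split_by_cluster(cluster_ids, train_frac=0.9, dev_frac=0.05, seed=0):
+     """Assign whole clusters to train/dev/test so no cluster spans two splits."""
+     ids = sorted(set(cluster_ids))
+     random.Random(seed).shuffle(ids)
+     n_train = int(train_frac * len(ids))
+     n_dev = int(dev_frac * len(ids))
+     train, dev = set(ids[:n_train]), set(ids[n_train:n_train + n_dev])
+     return {
+         cid: "train" if cid in train else "dev" if cid in dev else "test"
+         for cid in ids
+     }
+
+ # Example: every paraphrase pair is tagged with the cluster it came from.
+ print(split_by_cluster(["c1", "c1", "c2", "c3", "c4"]))
+ ```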
+
+ ### Speeds, Sizes, Times
+
+ The model creators note in the [associated paper](https://arxiv.org/pdf/2203.10940.pdf):
+ > All models are trained with batch size of 32 on 2 NVIDIA A100 GPUs for 6 epochs.
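+
+ For orientation only, here is a sketch of how such a run could be configured with the `transformers` `Seq2SeqTrainer`. The epoch count, GPU count, and global batch size come from the quote above; the base checkpoint, learning rate, and paths are assumed placeholders:
+
+ ```python
+ from transformers import (
+     AutoModelForSeq2SeqLM,
+     AutoTokenizer,
+     DataCollatorForSeq2Seq,
+     Seq2SeqTrainer,
+     Seq2SeqTrainingArguments,
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained("t5-base")   # parent checkpoint: assumed
+ model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
+
+ args = Seq2SeqTrainingArguments(
+     output_dir="qcpg-sentences",            # placeholder
+     num_train_epochs=6,                     # from the paper
+     per_device_train_batch_size=16,         # 16 x 2 GPUs = global batch size 32 (assumed split)
+     learning_rate=5e-5,                     # assumed, not reported in the quote
+ )
+
+ # train_dataset / eval_dataset would hold the tokenized paraphrase pairs.
+ trainer = Seq2SeqTrainer(
+     model=model,
+     args=args,
+     data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
+     # train_dataset=..., eval_dataset=...,
+ )
+ ```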
+
+ # Evaluation
+
+ ## Testing Data, Factors & Metrics
+
+ ### Testing Data
+
+ More information needed
+
+ ### Factors
+
+ More information needed
+
+ ### Metrics
+
+ More information needed
+
+ ## Results
+
+ More information needed
+
+ # Model Examination
+
+ More information needed
+
+ # Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** 2 NVIDIA A100
+ - **Hours used:** More information needed
+ - **Cloud Provider:** More information needed
+ - **Compute Region:** More information needed
+ - **Carbon Emitted:** More information needed
+
+ # Technical Specifications [optional]
+
+ ## Model Architecture and Objective
+
+ More information needed
+
+ ## Compute Infrastructure
+
+ More information needed
+
+ ### Hardware
+
+ 2 NVIDIA A100 GPUs (see the associated paper).
+
+ ### Software
+
+ More information needed.
+
+ # Citation
+
+ **BibTeX:**
+
+ ```bibtex
+ @inproceedings{bandel-etal-2022-quality,
+     title = "Quality Controlled Paraphrase Generation",
+     author = "Bandel, Elron  and
+       Aharonov, Ranit  and
+       Shmueli-Scheuer, Michal  and
+       Shnayderman, Ilya  and
+       Slonim, Noam  and
+       Ein-Dor, Liat",
+     booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
+     month = may,
+     year = "2022",
+     address = "Dublin, Ireland",
+     publisher = "Association for Computational Linguistics",
+     url = "https://aclanthology.org/2022.acl-long.45",
+     pages = "596--609",
+     abstract = "Paraphrase generation has been widely used in various downstream tasks. Most tasks benefit mainly from high quality paraphrases, namely those that are semantically similar to, yet linguistically diverse from, the original sentence. Generating high-quality paraphrases is challenging as it becomes increasingly hard to preserve meaning as linguistic diversity increases. Recent works achieve nice results by controlling specific aspects of the paraphrase, such as its syntactic tree. However, they do not allow to directly control the quality of the generated paraphrase, and suffer from low flexibility and scalability. Here we propose QCPG, a quality-guided controlled paraphrase generation model, that allows directly controlling the quality dimensions. Furthermore, we suggest a method that given a sentence, identifies points in the quality control space that are expected to yield optimal generated paraphrases. We show that our method is able to generate paraphrases which maintain the original meaning while achieving higher diversity than the uncontrolled baseline. The models, the code, and the data can be found in https://github.com/IBM/quality-controlled-paraphrase-generation.",
+ }
+ ```
+
+ **APA:**
+
+ More information needed
+
+ # Glossary [optional]
+
+ More information needed
+
+ # More Information [optional]
+
+ More information needed
+
+ # Model Card Authors [optional]
+
+ IBM in collaboration with Ezi Ozoani and the Hugging Face team
+
+ # Model Card Contact
+
+ More information needed
+
+ # How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained("ibm/qcpg-sentences")
+ model = AutoModelForSeq2SeqLM.from_pretrained("ibm/qcpg-sentences")
+ ```
+
+ </details>
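+
+ Once loaded, generation follows the standard seq2seq API. The sketch below is illustrative only: QCPG conditions on quality control values encoded in the input text, the plain-sentence prompt is a placeholder, and the exact input format is documented in the GitHub repo.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained("ibm/qcpg-sentences")
+ model = AutoModelForSeq2SeqLM.from_pretrained("ibm/qcpg-sentences")
+
+ # Placeholder input; see the GitHub repo for how the quality-control values
+ # are prepended to the source sentence.
+ sentence = "Is it possible to purchase tickets at the entrance?"
+
+ inputs = tokenizer(sentence, return_tensors="pt")
+ outputs = model.generate(**inputs, max_length=64, num_beams=4)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```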