Commit 8976b60 by pangpang666 (1 parent: 0c91e7c)

Create README.md

README.md (added)
@@ -0,0 +1,65 @@
This repository provides a GPT-2 language model trained with SimCTG on English Wikipedia, based on our paper [_A Contrastive Framework for Neural Text Generation_](https://arxiv.org/abs/2202.06417).

We provide a detailed tutorial on how to apply SimCTG and contrastive search in our [project repo](https://github.com/yxuansu/SimCTG#4-huggingface-style-tutorials-back-to-top). The steps below give a brief walkthrough of how to generate text with our approach.

## 1. Installation of SimCTG:
```bash
pip install simctg --upgrade
```
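
To confirm that the installation succeeded before loading any model, a small check such as the following may help. It is only a sketch: the helper name `installed_version` is hypothetical and not part of `simctg`, and it relies solely on Python's standard `importlib.metadata` module.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version, or None if `pip install simctg` has not been run.
print("simctg version:", installed_version("simctg"))
```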
## 2. Initialize SimCTG Model:
```python
import torch

# Load the SimCTG language model
from simctg.simctggpt import SimCTGGPT

model_name = 'cambridgeltl/simctg_english_wikipedia'
model = SimCTGGPT(model_name)
model.eval()  # switch to inference mode (disables dropout)
tokenizer = model.tokenizer
```
## 3. Prepare the Text Prefix:
```python
prefix_text = r"Insect farming is the practice of raising and breeding insects as livestock, also referred to as minilivestock or micro stock. Insects may be farmed for the commodities"
print('Prefix is: {}'.format(prefix_text))

# Tokenize the prefix and build a batch of one sequence with shape (1, seq_len)
tokens = tokenizer.tokenize(prefix_text)
input_ids = tokenizer.convert_tokens_to_ids(tokens)
input_ids = torch.LongTensor(input_ids).view(1, -1)
```
## 4. Generate Text with Contrastive Search:
```python
beam_width, alpha, decoding_len = 5, 0.6, 128
output = model.fast_contrastive_search(input_ids=input_ids, beam_width=beam_width,
                                       alpha=alpha, decoding_len=decoding_len)
print("Output:\n" + 100 * '-')
print(tokenizer.decode(output))
'''
   Prefix is: Insect farming is the practice of raising and breeding insects as livestock, also referred to as minilivestock or
   micro stock. Insects may be farmed for the commodities
   Output:
   ----------------------------------------------------------------------------------------------------
   Insect farming is the practice of raising and breeding insects as livestock, also referred to as minilivestock or micro stock.
   Insects may be farmed for the commodities they produce, such as honey, corn, sorghum, and other crops. In some cases, the
   production of insects is a way to increase income for the owner or his family. This type of farming has been described as "an
   economic system that benefits all people regardless of race, sex, or social status" (p. 9). A large number of farmers in North
   America, Europe, and South America have used the method of farming for food production in order to feed their families and livestock.
   The most common method of farming is by hand-cropping, which consists of cutting a hole in the ground and using a saw
'''
```
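
To make the roles of `beam_width` and `alpha` more concrete, here is a minimal sketch of the token-selection rule described in the paper: among the top-k candidates (k corresponds to `beam_width`), pick the token that maximizes `(1 - alpha) * probability - alpha * max cosine similarity to the context`. This is an illustration only, not the `simctg` implementation: the helper names, the two-dimensional "hidden states", and all numbers are made-up assumptions, whereas the library works on real model probabilities and hidden states. With `alpha = 0` the rule reduces to greedy search; a larger `alpha` penalizes candidates whose representation is close to tokens already in the context.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_step(candidates, context_states, alpha):
    """Select one token among the top-k candidates (hypothetical helper).

    candidates: list of (token, probability, hidden_state) for the top-k tokens
    context_states: hidden states of the tokens already in the context
    alpha: weight of the degeneration penalty, in [0, 1]
    """
    best_token, best_score = None, -math.inf
    for token, prob, state in candidates:
        # Degeneration penalty: similarity to the closest context token.
        penalty = max(cosine_similarity(state, h) for h in context_states)
        score = (1.0 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Toy, made-up values: "the" is the most probable candidate,
# but its representation nearly duplicates the existing context.
context_states = [[1.0, 0.0], [0.9, 0.1]]
candidates = [
    ("the", 0.5, [1.0, 0.0]),
    ("honey", 0.3, [0.0, 1.0]),
    ("silk", 0.2, [0.5, 0.5]),
]

greedy_choice = contrastive_step(candidates, context_states, alpha=0.0)
contrastive_choice = contrastive_step(candidates, context_states, alpha=0.6)
print(greedy_choice, contrastive_choice)  # prints: the honey
```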

For more details about our work, please refer to our main [project repo](https://github.com/yxuansu/SimCTG).

## 5. Citation:
If you find our paper and resources useful, please kindly leave a star and cite our paper. Thanks!

```bibtex
@article{su2022contrastive,
  title={A Contrastive Framework for Neural Text Generation},
  author={Su, Yixuan and Lan, Tian and Wang, Yan and Yogatama, Dani and Kong, Lingpeng and Collier, Nigel},
  journal={arXiv preprint arXiv:2202.06417},
  year={2022}
}
```