This is a GPT-2 language model trained with SimCTG on English Wikipedia, as described in our paper [_A Contrastive Framework for Neural Text Generation_](https://arxiv.org/abs/2202.06417).

We provide a detailed tutorial on how to apply SimCTG and contrastive search in our [project repo](https://github.com/yxuansu/SimCTG#4-huggingface-style-tutorials-back-to-top). The steps below give a brief walkthrough of how to use our approach for text generation.

## 1. Installation of SimCTG:

```bash
pip install simctg --upgrade
```
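
As an optional sanity check after installing, the sketch below uses only the Python standard library to report whether a package is installed and which version. The helper name `installed_version` is hypothetical and is not part of `simctg`; it assumes Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the status of the simctg package in the current environment.
version = installed_version("simctg")
print("simctg version:", version if version else "not installed")
```

If this prints "not installed", the `pip install` step above most likely ran in a different Python environment.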

## 2. Initialize SimCTG Model:

```python
import torch

# load SimCTG language model
from simctg.simctggpt import SimCTGGPT

model_name = r'cambridgeltl/simctg_english_wikipedia'
model = SimCTGGPT(model_name)
model.eval()
tokenizer = model.tokenizer
```

## 3. Prepare the Text Prefix:

```python
prefix_text = r"Insect farming is the practice of raising and breeding insects as livestock, also referred to as minilivestock or micro stock. Insects may be farmed for the commodities"
print('Prefix is: {}'.format(prefix_text))

# tokenize the prefix and convert the tokens to vocabulary ids
tokens = tokenizer.tokenize(prefix_text)
input_ids = tokenizer.convert_tokens_to_ids(tokens)

# reshape into a batch of size 1: (1, sequence_length)
input_ids = torch.LongTensor(input_ids).view(1, -1)
```

## 4. Generate Text with Contrastive Search:

```python
beam_width, alpha, decoding_len = 5, 0.6, 128
output = model.fast_contrastive_search(input_ids=input_ids, beam_width=beam_width,
                                       alpha=alpha, decoding_len=decoding_len)
print("Output:\n" + 100 * '-')
print(tokenizer.decode(output))
'''
   Prefix is: Insect farming is the practice of raising and breeding insects as livestock, also referred to as minilivestock or
   micro stock. Insects may be farmed for the commodities
   Output:
   ----------------------------------------------------------------------------------------------------
   Insect farming is the practice of raising and breeding insects as livestock, also referred to as minilivestock or micro stock.
   Insects may be farmed for the commodities they produce, such as honey, corn, sorghum, and other crops. In some cases, the
   production of insects is a way to increase income for the owner or his family. This type of farming has been described as "an
   economic system that benefits all people regardless of race, sex, or social status" (p. 9). A large number of farmers in North
   America, Europe, and South America have used the method of farming for food production in order to feed their families and livestock.
   The most common method of farming is by hand-cropping, which consists of cutting a hole in the ground and using a saw
'''
```
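
To give a feel for what `beam_width` and `alpha` control, here is a minimal, self-contained sketch of a single contrastive search step as formulated in the paper: among the top-k most probable candidate tokens (we assume `beam_width` plays the role of k), pick the one that maximizes `(1 - alpha) * probability - alpha * penalty`, where the penalty is the largest cosine similarity between the candidate's representation and the representations of the tokens generated so far. This is an illustration only, not the `simctg` implementation; the function name `contrastive_step`, the toy probabilities, and the two-dimensional vectors are all made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_step(candidates, context_states, alpha):
    """Pick one token from the top-k candidates.

    candidates: list of (token, probability, hidden_state) tuples.
    context_states: hidden states of the tokens generated so far.
    """
    best_token, best_score = None, -math.inf
    for token, prob, state in candidates:
        # Degeneration penalty: similarity to the closest previous token.
        penalty = max(cosine(state, h) for h in context_states)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Toy data (hypothetical): two previous tokens and two candidates.
context_states = [[1.0, 0.0], [0.9, 0.1]]
candidates = [
    ("the", 0.50, [1.0, 0.0]),    # most probable, but identical to a context state
    ("honey", 0.40, [0.0, 1.0]),  # slightly less probable, dissimilar to the context
]

greedy_choice = contrastive_step(candidates, context_states, alpha=0.0)
contrastive_choice = contrastive_step(candidates, context_states, alpha=0.6)
print(greedy_choice, contrastive_choice)
```

With `alpha=0.0` the penalty is ignored and the step reduces to greedy selection of the most probable candidate; with `alpha=0.6` the candidate that repeats the context is penalized and the less similar one is chosen instead.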

For more details on our work, please refer to our main [project repo](https://github.com/yxuansu/SimCTG).

## 5. Citation:

If you find our paper and resources useful, please leave a star on the repo and cite our paper. Thanks!

```bibtex
@article{su2022contrastive,
  title={A Contrastive Framework for Neural Text Generation},
  author={Su, Yixuan and Lan, Tian and Wang, Yan and Yogatama, Dani and Kong, Lingpeng and Collier, Nigel},
  journal={arXiv preprint arXiv:2202.06417},
  year={2022}
}
```