---
language:
- en
tags:
- gpt2
license: mit
datasets:
- wiki_qa
inference: false
---
|
## Description

This question-answering model was fine-tuned from [distilgpt2](https://huggingface.co/distilgpt2), a generative, left-to-right transformer in the style of GPT-2. It was trained on Microsoft's [Wiki-QA](https://huggingface.co/datasets/wiki_qa) dataset.

# How to run XBOT-RK/Distil-GPT2-Wiki-QA using Transformers

## Question-Answering

The following code shows how to use the Distil-GPT2-Wiki-QA checkpoint with the Transformers library to generate answers.
|
```python
import re

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("XBOT-RK/distilgpt2-wiki-qa")
model = GPT2LMHeadModel.from_pretrained("XBOT-RK/distilgpt2-wiki-qa")

# Move the model to GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

def infer(question):
    inputs = tokenizer(question, return_tensors="pt").to(device)
    generated_tensor = model.generate(**inputs, max_new_tokens=50)
    return tokenizer.decode(generated_tensor[0])

def processAnswer(question, result):
    # Strip the echoed prompt, then extract the text between the
    # <bot>: and <endofstring> markers emitted by the model
    answer = result.replace(question, "").strip()
    if "<bot>:" in answer:
        answer = re.search("<bot>:(.*)", answer).group(1).strip()
    if "<endofstring>" in answer:
        answer = re.search("(.*)<endofstring>", answer).group(1).strip()
    return answer

question = "What is a tropical cyclone?"
result = infer(question)
answer = processAnswer(question, result)
print("Question:", question)
print("Answer:", answer)
```

Output:

```
Question: What is a tropical cyclone?
Answer: The cyclone is named after the climber Edmond Halley, who described it as the 'most powerful cyclone of the Atlantic'.
```
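Judging from the post-processing above, the model's raw generations appear to follow a `<question> <bot>: <answer> <endofstring>` template. As a minimal sketch (the sample string below is hypothetical, not real model output), the parsing logic can be exercised on its own without downloading the checkpoint:

```python
import re

def process_answer(question, result):
    # Strip the echoed prompt, then extract the text between <bot>: and <endofstring>
    answer = result.replace(question, "").strip()
    if "<bot>:" in answer:
        answer = re.search("<bot>:(.*)", answer).group(1).strip()
    if "<endofstring>" in answer:
        answer = re.search("(.*)<endofstring>", answer).group(1).strip()
    return answer

# Hypothetical raw generation written in the assumed template
raw = "What is a tropical cyclone? <bot>: A rapidly rotating storm system. <endofstring>"
print(process_answer("What is a tropical cyclone?", raw))
# -> A rapidly rotating storm system.
```

Note that if neither marker is present, the function simply returns the generation with the prompt removed, so it degrades gracefully on malformed output.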