|
---
license: gpl-3.0
tags:
- conversational
- gpt2
language:
- es
datasets:
- open_subtitles
widget:
- text: Me gusta el deporte
  example_title: Interacción
- text: Hola
  example_title: Saludo
- text: ¿Cómo estás?
  example_title: Pregunta
---
|
|
|
# Spanish GPT-2 as backbone |
|
|
|
Model fine-tuned for conversational Spanish using the [OpenSubtitles](https://opus.nlpl.eu/OpenSubtitles-v2018.php) dataset. The backbone is a GPT-2 model trained from scratch on the Spanish portion of the OSCAR dataset by the [Flax/Jax community](https://huggingface.co/flax-community/gpt-2-spanish) at HuggingFace.
|
|
|
## Model description and fine-tuning
|
|
|
The backbone is OpenAI's GPT-2, introduced in the paper "Language Models are Unsupervised Multitask Learners" by Alec Radford et al. A transfer-learning step on a large Spanish dialogue dataset then adapted this text-generation model to conversational use. Special tokens play a key role in fine-tuning: they delimit each training example and mark where the bot's reply begins.
|
|
|
```python
from transformers import AutoTokenizer

# Tokenizer of the Spanish GPT-2 backbone
tokenizer = AutoTokenizer.from_pretrained("flax-community/gpt-2-spanish")

# Delimiters for each training example, plus a marker for the bot's turn
tokenizer.add_special_tokens({"pad_token": "<pad>",
                              "bos_token": "<startofstring>",
                              "eos_token": "<endofstring>"})
tokenizer.add_tokens(["<bot>:"])
# Remember to resize the embeddings afterwards: model.resize_token_embeddings(len(tokenizer))
```
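
With these tokens, each dialogue pair is presumably serialized into a single training string. The sketch below illustrates that format; the `build_example` helper is hypothetical, and the template is inferred from the inference code in the next section:

```python
def build_example(user_text, bot_text):
    # Hypothetical helper: serialize one dialogue turn with the special
    # tokens above (format inferred from the inference template below)
    return "<startofstring> " + user_text + " <bot>: " + bot_text + " <endofstring>"

print(build_example("Hola", "Hola, ¿cómo estás?"))
# <startofstring> Hola <bot>: Hola, ¿cómo estás? <endofstring>
```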
|
|
|
## How to use |
|
|
|
You can use this model directly by loading it with the Auto classes for causal language modeling:
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("erikycd/chatbot_hadita")
model = AutoModelForCausalLM.from_pretrained("erikycd/chatbot_hadita")

# Prefer a CUDA GPU, then Apple Silicon (MPS), otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(device)

def infer(inp):
    # Wrap the user input in the same template used during fine-tuning
    inp = "<startofstring> " + inp + " <bot>: "
    inp = tokenizer(inp, return_tensors="pt")
    X = inp["input_ids"].to(device)
    attn = inp["attention_mask"].to(device)
    output = model.generate(X, attention_mask=attn, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Simple chat loop: type 'bye' or 'quit' to stop
exit_commands = ('bye', 'quit')
text = ''
while text not in exit_commands:
    text = input('\nUser: ')
    output = infer(text)
    print('Bot: ', output)
```
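
The `generate` call above uses the library's default greedy decoding. For more varied replies, the standard sampling arguments of the `transformers` generation API can be passed instead; the sketch below is a drop-in replacement for the `generate` line inside `infer`, with illustrative values that are not tuned for this model:

```python
# Illustrative sampling settings (assumed values, not tuned for this model)
output = model.generate(
    X,
    attention_mask=attn,
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=50,  # cap the length of the generated reply
    do_sample=True,     # sample from the distribution instead of greedy decoding
    top_k=50,           # consider only the 50 most likely next tokens
    top_p=0.95,         # nucleus sampling
)
```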