---
license: gpl-3.0
tags:
- conversational
- gpt2
language:
- es
datasets:
- open_subtitles
widget:
- text: Me gusta el deporte
example_title: Interacción
- text: Hola
example_title: Saludo
- text: ¿Cómo estás?
example_title: Pregunta
---
# Spanish GPT-2 as backbone
Model fine-tuned for Spanish conversation on the [OpenSubtitles](https://opus.nlpl.eu/OpenSubtitles-v2018.php) dataset. The backbone is a GPT-2
model trained from scratch on the Spanish portion of the OSCAR dataset by the [Flax/JAX](https://huggingface.co/flax-community/gpt-2-spanish)
Community at Hugging Face.
## Model description and fine-tuning
The backbone is OpenAI's GPT-2 architecture, introduced in the paper "Language Models are Unsupervised Multitask Learners"
by Alec Radford et al. Transfer learning on a large Spanish dialogue dataset then adapts the text-generation model to
conversational tasks. Special tokens play a key role in fine-tuning: they mark the start and end of each exchange and separate the user's turn from the bot's reply.
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("flax-community/gpt-2-spanish")
tokenizer.add_special_tokens({"pad_token": "<pad>",
                              "bos_token": "<startofstring>",
                              "eos_token": "<endofstring>"})
tokenizer.add_tokens(["<bot>:"])
# After adding tokens, the model's embedding matrix must be resized:
# model.resize_token_embeddings(len(tokenizer))
```
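With these tokens in place, each dialogue turn can be serialized into a single training string. The exact training script is not part of this card, so the helper below is a sketch inferred from the prompt format used in the inference example (`format_pair` is a hypothetical name):

```python
def format_pair(user_text, bot_text):
    # Serialize one user/bot exchange using the special tokens added above.
    # Assumed format, reconstructed from the prompt built in `infer` below.
    return f"<startofstring> {user_text} <bot>: {bot_text} <endofstring>"

print(format_pair("Hola", "¡Hola! ¿Cómo estás?"))
```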
## How to use
You can use this model directly with `AutoModelForCausalLM` for causal language modeling:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("erikycd/chatbot_hadita")
model = AutoModelForCausalLM.from_pretrained("erikycd/chatbot_hadita")
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(device)
def infer(inp):
    # Wrap the user input in the prompt format used during fine-tuning
    inp = "<startofstring> " + inp + " <bot>: "
    inp = tokenizer(inp, return_tensors="pt")
    X = inp["input_ids"].to(device)
    attn = inp["attention_mask"].to(device)
    output = model.generate(X, attention_mask=attn, pad_token_id=tokenizer.eos_token_id)
    output = tokenizer.decode(output[0], skip_special_tokens=True)
    return output

exit_commands = ('bye', 'quit')
while True:
    text = input('\nUser: ')
    if text in exit_commands:
        break
    output = infer(text)
    print('Bot: ', output)
```
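Note that `skip_special_tokens=True` removes `<startofstring>`, `<endofstring>`, and `<pad>`, but not `<bot>:`, which was registered with `add_tokens` as a regular token. The decoded string therefore still contains the echoed prompt and the `<bot>:` marker. A minimal post-processing sketch (`extract_reply` is a hypothetical helper, not part of the model):

```python
def extract_reply(decoded):
    # Keep only the text after the last "<bot>:" marker; if the marker is
    # absent, return the decoded string unchanged.
    return decoded.rsplit("<bot>:", 1)[-1].strip()

print(extract_reply("Hola <bot>: Hola, ¿qué tal?"))
```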