|
---
license: gpl-3.0
tags:
- conversational
- gpt2
language:
- es
datasets:
- open_subtitles
widget:
- text: Me gusta el deporte
  example_title: Interacción
- text: Hola
  example_title: Saludo
- text: ¿Cómo estás?
  example_title: Pregunta
---
|
|
|
# Spanish GPT-2 as backbone |
|
|
|
Model fine-tuned for conversational Spanish using the [OpenSubtitles](https://opus.nlpl.eu/OpenSubtitles-v2018.php) dataset. The backbone is a GPT-2 model trained from scratch on the Spanish portion of the OSCAR dataset by the [Flax/Jax community](https://huggingface.co/flax-community/gpt-2-spanish) at HuggingFace.
|
|
|
## Model description and fine-tuning
|
|
|
The backbone is OpenAI's GPT-2, introduced in the paper "Language Models are Unsupervised Multitask Learners" by Alec Radford et al. A transfer-learning step on a large Spanish dialogue dataset then adapted this text-generation model to conversational use. Special tokens play a key role in fine-tuning: they delimit each training example and mark where the bot's reply begins.
|
|
|
```python
from transformers import AutoTokenizer

# Tokenizer of the Spanish GPT-2 backbone
tokenizer = AutoTokenizer.from_pretrained("flax-community/gpt-2-spanish")

# Delimiters for each training example, plus a marker for the bot's turn
tokenizer.add_special_tokens({"pad_token": "<pad>",
                              "bos_token": "<startofstring>",
                              "eos_token": "<endofstring>"})
tokenizer.add_tokens(["<bot>:"])
# Remember to resize the embeddings afterwards: model.resize_token_embeddings(len(tokenizer))
```
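
With these tokens, each dialogue pair is presumably serialized into a single training string. The sketch below illustrates that format; the `build_example` helper is hypothetical, and the template is inferred from the inference code in the next section:

```python
def build_example(user_text, bot_text):
    # Hypothetical helper: serialize one dialogue turn with the special
    # tokens above (format inferred from the inference template below)
    return "<startofstring> " + user_text + " <bot>: " + bot_text + " <endofstring>"

print(build_example("Hola", "Hola, ¿cómo estás?"))
# <startofstring> Hola <bot>: Hola, ¿cómo estás? <endofstring>
```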
|
|
|
## How to use |
|
|
|
You can use this model directly by loading it with the Auto classes for causal language modeling:
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("erikycd/chatbot_hadita")
model = AutoModelForCausalLM.from_pretrained("erikycd/chatbot_hadita")

# Prefer a CUDA GPU, then Apple Silicon (MPS), otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(device)

def infer(inp):
    # Wrap the user input in the same template used during fine-tuning
    inp = "<startofstring> " + inp + " <bot>: "
    inp = tokenizer(inp, return_tensors="pt")
    X = inp["input_ids"].to(device)
    attn = inp["attention_mask"].to(device)
    output = model.generate(X, attention_mask=attn, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Simple chat loop: type 'bye' or 'quit' to stop
exit_commands = ('bye', 'quit')
text = ''
while text not in exit_commands:
    text = input('\nUser: ')
    output = infer(text)
    print('Bot: ', output)
```
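
The `generate` call above uses the library's default greedy decoding. For more varied replies, the standard sampling arguments of the `transformers` generation API can be passed instead; the sketch below is a drop-in replacement for the `generate` line inside `infer`, with illustrative values that are not tuned for this model:

```python
# Illustrative sampling settings (assumed values, not tuned for this model)
output = model.generate(
    X,
    attention_mask=attn,
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=50,  # cap the length of the generated reply
    do_sample=True,     # sample from the distribution instead of greedy decoding
    top_k=50,           # consider only the 50 most likely next tokens
    top_p=0.95,         # nucleus sampling
)
```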