1.75 kB
--- | |
thumbnail: https://huggingface.co/front/thumbnails/dialogpt.png | |
tags: | |
- conversational | |
license: mit | |
--- | |
# DialoGPT Trained on WhatsApp chats | |
This is an instance of [microsoft/DialoGPT-medium](https://huggingface.co/microsoft/DialoGPT-medium) trained on WhatsApp chats or you can train this model on [a Kaggle game script dataset](https://www.kaggle.com/ruolinzheng/twewy-game-script). | |
feel free to ask me questions on discord server [discord server](https://discord.gg/Gqhje8Z7DX) | |
Chat with the model: | |
```python | |
from transformers import AutoTokenizer, AutoModelWithLMHead | |
tokenizer = AutoTokenizer.from_pretrained("harrydonni/DialoGPT-small-Michael-Scott") | |
model = AutoModelWithLMHead.from_pretrained("harrydonni/DialoGPT-small-Michael-Scott") | |
# Let's chat for 4 lines | |
for step in range(4): | |
# encode the new user input, add the eos_token and return a tensor in Pytorch | |
new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt') | |
# print(new_user_input_ids) | |
# append the new user input tokens to the chat history | |
bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids | |
# generated a response while limiting the total chat history to 1000 tokens, | |
chat_history_ids = model.generate( | |
bot_input_ids, max_length=200, | |
pad_token_id=tokenizer.eos_token_id, | |
no_repeat_ngram_size=3, | |
do_sample=True, | |
top_k=100, | |
top_p=0.7, | |
temperature=0.8 | |
) | |
# pretty print last ouput tokens from bot | |
print("Michael: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True))) | |
``` | |
this is done by shreesha thank you...... |