# ai-msgbot GPT2-L

_NOTE: model card is WIP_

GPT2-L (~774M parameters) trained on [the Wizard of Wikipedia dataset](https://parl.ai/projects/wizard_of_wikipedia/) for 40k steps with **33**/36 layers frozen using `aitextgen`.

Designed for use with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to create an open-ended chatbot (of course, if other use cases arise, have at it).

## conversation data

The dataset was tokenized and fed to the model as a conversation between two speakers, whose names are listed below. This matters when writing prompts and when filtering/extracting text from the model's responses.

`script_speaker_name` = `person alpha`

`script_responder_name` = `person beta`

## examples

- the default inference API examples should work _okay_
- an ideal test is to explicitly add `person beta` to the end of the prompt text, so the model is forced to respond to the entered prompt instead of simply continuing it (see the sketch below)
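As a rough illustration, the sketch below loads a checkpoint with `aitextgen` and frames the prompt as an exchange between the two speakers, ending on `person beta` so the model answers rather than extends the question. The `model_folder` path, the exact turn layout, and the generation parameters are assumptions for illustration, not a prescribed interface.

```python
from aitextgen import aitextgen

# Load a fine-tuned checkpoint. "trained_model" is a hypothetical local
# folder; point this at wherever the checkpoint actually lives.
ai = aitextgen(model_folder="trained_model", to_gpu=False)

# Frame the input as a turn from `person alpha`, then end the prompt with
# `person beta:` so the model is forced into the responder role. The exact
# turn separator (name + colon + newline) is an assumption about the
# training format.
prompt = (
    "person alpha:\n"
    "what is the capital of france?\n"
    "person beta:\n"
)

# generate_one() returns a single generated string (prompt included).
raw = ai.generate_one(
    prompt=prompt,
    max_length=128,
    temperature=0.7,
    top_p=0.95,
)
print(raw)
```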
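Continuing the sketch above, one way to filter the reply out of the raw output: strip the prompt, then cut the continuation where the model starts inventing the next `person alpha` turn. The turn-boundary convention here is the same assumption as in the previous block.

```python
def extract_response(generated: str, prompt: str) -> str:
    """Pull person beta's reply out of the raw generated text.

    Assumes `generated` begins with `prompt` (aitextgen includes the
    prompt in its output) and that speaker names mark turn boundaries.
    """
    continuation = generated[len(prompt):]
    # If the model begins a new turn for the other speaker, drop it.
    reply = continuation.split("person alpha:")[0]
    return reply.strip()

# Uses `raw` and `prompt` from the generation sketch above.
print(extract_response(raw, prompt))
```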