# ai-msgbot GPT2-L
_NOTE: model card is WIP_
GPT2-L (~774 M parameters) trained on [the Wizard of Wikipedia dataset](https://parl.ai/projects/wizard_of_wikipedia/) for 40k steps with **33**/36 layers frozen using `aitextgen`.
Designed for use with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to create an open-ended chatbot (of course, if other use cases arise, have at it).
## conversation data
The dataset was tokenized and fed to the model as a conversation between two speakers, whose names are below. This is relevant for writing prompts and filtering/extracting text from responses.
`script_speaker_name` = `person alpha`
`script_responder_name` = `person beta`
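
A minimal sketch of writing a prompt and filtering/extracting the responder's text, assuming turns are laid out as `name:` lines (the exact separators are an assumption — check ai-msgbot for the real script format):

```python
# Speaker names used when the dataset was tokenized (see above).
SPEAKER = "person alpha"
RESPONDER = "person beta"


def build_prompt(message: str) -> str:
    """Frame a user message so generation starts as the responder."""
    return f"{SPEAKER}:\n{message}\n{RESPONDER}:\n"


def extract_reply(generated: str, prompt: str) -> str:
    """Drop the prompt, then cut at the next speaker tag so only the
    responder's single turn remains."""
    reply = generated[len(prompt):]
    for tag in (f"{SPEAKER}:", f"{RESPONDER}:"):
        reply = reply.split(tag)[0]
    return reply.strip()


# example: generated text echoes the prompt and starts a new turn afterwards
prompt = build_prompt("have you ever seen a wizard?")
raw = prompt + "only in books, sadly.\nperson alpha:\nwhich books?"
print(extract_reply(raw, prompt))  # only in books, sadly.
```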
## examples
- the default inference API examples should work _okay_
- an ideal test is to explicitly add `person beta` at the end of the prompt text, so the model is forced to respond to the input rather than simply continue it
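
To illustrate the second point, a quick sketch of the two prompt styles (turn separators are an assumption; check ai-msgbot for the exact format):

```python
# Default: the model simply continues the user's sentence.
# "Ideal test": end the prompt with `person beta` so the model is steered
# to answer the input as the responder instead of extending it.

user_text = "have you ever seen a wizard?"

default_prompt = user_text
forced_prompt = f"person alpha:\n{user_text}\nperson beta:\n"

print(forced_prompt)
```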