Update README.md
README.md
CHANGED
@@ -2,6 +2,15 @@
 
 _NOTE: this model card is a WIP_
 
-GPT2-L (774M parameters)
+GPT2-L (774M parameters) fine-tuned on the Wizard of Wikipedia dataset for 40k steps with 34 of 36 layers frozen using `aitextgen`. The model was then further fine-tuned on the [Daily Dialogues](http://yanran.li/dailydialog) dataset for an additional 40k steps, this time with **35** of 36 layers frozen.
 
-Designed for use with [ai-msgbot](https://github.com/pszemraj/ai-msgbot).
+Designed for use with [ai-msgbot](https://github.com/pszemraj/ai-msgbot).
+
+
+## conversation data
+
+The dataset was tokenized and fed to the model as a conversation between two speakers, whose names are given below. This is relevant for writing prompts and for filtering/extracting text from the model's responses.
+
+`script_speaker_name` = `person alpha`
+
+`script_responder_name` = `person beta`
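
The two-stage fine-tune described above can be reproduced roughly as follows. This is a minimal sketch, assuming the `freeze_layers` / `num_layers_freeze` options of a recent `aitextgen` release; the dataset file paths are placeholders, not the actual training files.

```python
from aitextgen import aitextgen

# GPT2-L (774M parameters)
ai = aitextgen(tf_gpt2="774M", to_gpu=True)

# Stage 1: Wizard of Wikipedia, 40k steps, 34 of 36 layers frozen
ai.train(
    "wizard_of_wikipedia.txt",  # placeholder path to the formatted dataset
    num_steps=40_000,
    freeze_layers=True,
    num_layers_freeze=34,
)

# Stage 2: Daily Dialogues, a further 40k steps, now with 35 of 36 layers frozen
ai.train(
    "daily_dialogues.txt",  # placeholder path
    num_steps=40_000,
    freeze_layers=True,
    num_layers_freeze=35,
)
```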
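
Because the training data is framed as a dialogue between `person alpha` and `person beta`, prompts should end with the responder's tag and replies should be cut at the next speaker tag. The sketch below is an assumption about that layout (a simple `speaker:` line format) with a placeholder model id; ai-msgbot handles this prompt construction and response filtering for you.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="path/to/this-model")  # placeholder model id

# End the prompt with the responder's tag so the model continues as person beta
prompt = "person alpha:\nHave you been to Zurich before?\nperson beta:\n"
raw = generator(prompt, max_new_tokens=64, do_sample=True, top_p=0.95)[0]["generated_text"]

# Keep only person beta's reply: drop the prompt, then stop at the next speaker tag
reply = raw[len(prompt):].split("person alpha:")[0].strip()
print(reply)
```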