# ai-msgbot GPT2-L
_NOTE: model card is WIP_
GPT2-L (~774 M parameters) trained on [the Wizard of Wikipedia dataset](https://parl.ai/projects/wizard_of_wikipedia/) for 40k steps with **33**/36 layers frozen using `aitextgen`.
Designed for use with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to create an open-ended chatbot (of course, if other use cases arise, have at it).
## conversation data
The dataset was tokenized and fed to the model as a conversation between two speakers, whose names are below. This is relevant for writing prompts and filtering/extracting text from responses.
`script_speaker_name` = `person alpha`
`script_responder_name` = `person beta`
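
A minimal sketch of writing a prompt and filtering/extracting the responder's text, assuming turns are laid out as `name:` lines (the exact separators are an assumption — check ai-msgbot for the real script format):

```python
# Speaker names used when the dataset was tokenized (see above).
SPEAKER = "person alpha"
RESPONDER = "person beta"


def build_prompt(message: str) -> str:
    """Frame a user message so generation starts as the responder."""
    return f"{SPEAKER}:\n{message}\n{RESPONDER}:\n"


def extract_reply(generated: str, prompt: str) -> str:
    """Drop the prompt, then cut at the next speaker tag so only the
    responder's single turn remains."""
    reply = generated[len(prompt):]
    for tag in (f"{SPEAKER}:", f"{RESPONDER}:"):
        reply = reply.split(tag)[0]
    return reply.strip()


# example: generated text echoes the prompt and starts a new turn afterwards
prompt = build_prompt("have you ever seen a wizard?")
raw = prompt + "only in books, sadly.\nperson alpha:\nwhich books?"
print(extract_reply(raw, prompt))  # only in books, sadly.
```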
## examples
- the default inference API examples should work _okay_
- an ideal test is to explicitly add `person beta` at the end of the prompt text, so the model is forced to respond to the input rather than simply continue it
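
To illustrate the second point, a quick sketch of the two prompt styles (turn separators are an assumption; check ai-msgbot for the exact format):

```python
# Default: the model simply continues the user's sentence.
# "Ideal test": end the prompt with `person beta` so the model is steered
# to answer the input as the responder instead of extending it.

user_text = "have you ever seen a wizard?"

default_prompt = user_text
forced_prompt = f"person alpha:\n{user_text}\nperson beta:\n"

print(forced_prompt)
```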