Reimu Hakurei committed on
Commit
520f0ff
1 Parent(s): cf64643

Create modelcard

Files changed (1)
  1. README.md +67 -0
README.md ADDED
@@ -0,0 +1,67 @@
+ ---
+ language:
+ - en
+ tags:
+ - pytorch
+ - causal-lm
+ license: mit
+
+ ---
+
+ # Lit-125M - A Small Fine-tuned Model For Fictional Storytelling
+
+ Lit-125M is a GPT-Neo 125M model fine-tuned on 2GB of diverse light novels, erotica, and annotated literature to generate novel-like fictional text.
+
+ ## Model Description
+
+ The base model used for fine-tuning is [GPT-Neo 125M](https://huggingface.co/EleutherAI/gpt-neo-125M), a 125-million-parameter autoregressive language model trained on [The Pile](https://pile.eleuther.ai/).
+
+ ## Training Data & Annotative Prompting
+
+ The fine-tuning data was gathered from various sources, such as [Project Gutenberg](https://www.gutenberg.org/). Each work in the annotated fiction dataset is prefixed with a line of tags, which helps steer generation toward a particular style. The following example prompt shows how to use the annotations.
+
+ ```
+ [ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror; Tags: 3rdperson, scary; Style: Dark ]
+ ***
+ When a traveler in north central Massachusetts takes the wrong fork...
+ ```
+
+ The annotations can be mixed and matched to steer generation toward a specific style.
+
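The annotation header shown in the example prompt can also be assembled programmatically. The following is a minimal sketch and not part of the original model card: the helper name `build_prompt` and its parameters are hypothetical, the field order simply mirrors the example prompt, and returning the bare text when no annotations are given is an assumption.

```python
def build_prompt(text, title=None, author=None, genre=None, tags=None, style=None):
    """Assemble an annotated prompt in the format of the example prompt.

    Hypothetical helper: every annotation is optional, so fields can be
    mixed and matched. Omitted fields are left out of the header.
    """
    fields = []
    if title:
        fields.append(f"Title: {title}")
    if author:
        fields.append(f"Author: {author}")
    if genre:
        fields.append(f"Genre: {genre}")
    if tags:
        # Tags are comma-separated inside a single "Tags:" field.
        fields.append("Tags: " + ", ".join(tags))
    if style:
        fields.append(f"Style: {style}")

    if not fields:
        # Assumption: with no annotations, use the text as a plain prompt.
        return text

    header = "[ " + "; ".join(fields) + " ]"
    # The header and the story text are separated by a line containing "***".
    return f"{header}\n***\n{text}"


prompt = build_prompt(
    "When a traveler",
    title="The Dunwich Horror",
    author="H. P. Lovecraft",
    genre="Horror",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.
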
+ ## Downstream Uses
+
+ This model can be used for entertainment and as a creative writing assistant for fiction writers. Its small size also makes it convenient for debugging and for developing other models with a similar purpose.
+
+ ## Example Code
+
+ ```
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
+ model = AutoModelForCausalLM.from_pretrained('hakurei/lit-125M')
+ tokenizer = AutoTokenizer.from_pretrained('hakurei/lit-125M')
+
+ # Annotated prompt: a metadata header, a *** separator, then the opening text.
+ prompt = '''[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]
+ ***
+ When a traveler'''
+
+ input_ids = tokenizer.encode(prompt, return_tensors='pt')
+
+ # Sample up to 100 tokens beyond the prompt.
+ output = model.generate(
+     input_ids,
+     do_sample=True,
+     temperature=1.0,
+     top_p=0.9,
+     repetition_penalty=1.2,
+     max_length=input_ids.shape[1] + 100,
+     pad_token_id=tokenizer.eos_token_id,  # reuse EOS as the padding token
+ )
+
+ # The decoded text contains the prompt followed by the continuation.
+ generated_text = tokenizer.decode(output[0])
+ print(generated_text)
+ ```
+
+ Running this code produces output similar to the following:
+
+ ```
+ [ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]
+ ***
+ When a traveler takes a trip through the streets of the world, the traveler feels like a youkai with a whole world inside her mind. It can be very scary for a youkai. When someone goes in the opposite direction and knocks on your door, it is actually the first time you have ever come to investigate something like that.
+ That's right: everyone has heard stories about youkai, right? If you have heard them, you know what I'm talking about.
+ It's hard not to say you
+ ```
+
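Because the decoded output repeats the prompt, it can be useful to separate the annotation header from the story text before displaying it. The following is a minimal sketch, not part of the original model card: `split_generation` is a hypothetical helper, and it assumes the header and story are separated by a line containing only `***`, as in the prompt format.

```python
def split_generation(generated_text, separator="***"):
    """Split decoded output into (header, story).

    Hypothetical helper: looks for the first line consisting only of the
    separator. If none is found, the header is returned as an empty string.
    """
    header, sep, story = generated_text.partition(f"\n{separator}\n")
    if not sep:
        # No separator line: treat the whole text as story.
        return "", generated_text.strip()
    return header.strip(), story.strip()


sample = (
    "[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]\n"
    "***\n"
    "When a traveler takes a trip through the streets of the world"
)
header, story = split_generation(sample)
print(story)
```

Only the first separator line is treated as the boundary, so any later `***` in the story is left untouched.
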
+ ## Team Members and Acknowledgements
+
+ - [Anthony Mercurio](https://github.com/harubaru)
+ - Imperishable_NEET