---
license: apache-2.0
base_model: ministral/Ministral-3b-instruct
library_name: peft
---

# Ministral-3b-instruct-PromptEnhancing

Ministral-3b-instruct-PromptEnhancing is an instruction-tuned text-generation model fine-tuned with LoRA for prompt enhancement.

This model was released alongside three other models in the 2-3B parameter range, all trained on the same dataset with the same training arguments.

## Model Details

### Model Description

This model is a LoRA fine-tune of [ministral/Ministral-3b-instruct](https://huggingface.co/ministral/Ministral-3b-instruct).
The goal of this fine-tune is to provide a lightweight prompt-enhancing model for Stable Diffusion (or other diffusion models sharing the same prompting conventions), making image generation more accessible to everyone.
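
As a purely hypothetical illustration (not actual model output), prompt enhancement typically turns a short description into a detailed, Stable-Diffusion-style prompt:

> **Input:** `a castle on a hill`
>
> **Possible enhanced output:** `a majestic medieval castle on a windswept hill, golden hour lighting, dramatic clouds, highly detailed, digital painting`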

- **Developed by:** [groloch](https://huggingface.co/groloch)
- **Model type:** LoRA adapter
- **Language(s) (NLP):** English
- **License:** [Apache 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
- **Finetuned from model:** [ministral/Ministral-3b-instruct](https://huggingface.co/ministral/Ministral-3b-instruct)

### Model Sources

- **Paper:** _Coming soon_
- **Demo:** _Coming soon_

## Uses

This model should be used as a prompt-enhancing model for diffusion pipelines. The simplest way to try it is the official [demo](#) (_coming soon_).

### Direct Use

If you want to use it locally, refer to the following code snippet:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

base_repo_id = 'ministral/Ministral-3b-instruct'
adapter_repo_id = 'groloch/Ministral-3b-instruct-PromptEnhancing'

# Load the base model in bfloat16 and attach the LoRA adapter (requires `peft`)
tokenizer = AutoTokenizer.from_pretrained(base_repo_id)
model = AutoModelForCausalLM.from_pretrained(base_repo_id, torch_dtype=torch.bfloat16).to('cuda')
model.load_adapter(adapter_repo_id)

prompt_to_enhance = 'Sinister crocodile eating a jolly rabbit'

# Format the request with the model's chat template
chat = [
    {'role': 'user', 'content': prompt_to_enhance}
]
prompt = tokenizer.apply_chat_template(chat,
                                       tokenize=False,
                                       add_generation_prompt=True)

encoding = tokenizer(prompt, return_tensors='pt').to('cuda')

# Sampling settings matching the recommendations below
generation_config = model.generation_config
generation_config.do_sample = True
generation_config.max_new_tokens = 96
generation_config.temperature = 0.3
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id
generation_config.repetition_penalty = 2.0

with torch.inference_mode():
    outputs = model.generate(
        input_ids=encoding.input_ids,
        attention_mask=encoding.attention_mask,
        generation_config=generation_config
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
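
If you prefer to let `peft` resolve the base model automatically from the adapter config, the following sketch may work as an alternative (assumes a recent `peft` release that provides `AutoPeftModelForCausalLM`):

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_repo_id = 'groloch/Ministral-3b-instruct-PromptEnhancing'

# Loads the base model referenced in the adapter config, then applies the LoRA weights
model = AutoPeftModelForCausalLM.from_pretrained(adapter_repo_id,
                                                 torch_dtype=torch.bfloat16).to('cuda')
tokenizer = AutoTokenizer.from_pretrained('ministral/Ministral-3b-instruct')
```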

### Out-of-Scope Use

This model is meant to be used as a prompt enhancer. Inputs should be concise descriptions, not already-detailed full prompts.

Using this model for other purposes may yield unexpected behavior.

## Bias, Risks, and Limitations

This model was trained on a dataset that was partially AI-generated, which may carry over biases from the generating models.

As a lightweight model, it may also have significant limitations.

### Recommendations

Use a high repetition penalty (≥ 2.0) and a low temperature (< 0.4) for generation, and do not generate more than 128 tokens.
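
For convenience, these settings can be expressed as a standalone `GenerationConfig` (a minimal sketch reusing the values from the snippet above) and passed to `model.generate`:

```python
from transformers import GenerationConfig

# Recommended sampling settings; pad/eos token ids should still be set
# from the tokenizer, as in the Direct Use snippet above
enhance_config = GenerationConfig(
    do_sample=True,
    temperature=0.3,
    top_p=0.7,
    repetition_penalty=2.0,
    max_new_tokens=96,
)
```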

## Training Details

### Training Data

This model was trained for one epoch on [groloch/stable_diffusion_prompts_instruct](https://huggingface.co/datasets/groloch/stable_diffusion_prompts_instruct).
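
To inspect the training data, the dataset can be loaded directly from the Hub (a sketch; the `train` split name is an assumption):

```python
from datasets import load_dataset

# Download the instruction-formatted prompt-enhancement dataset
ds = load_dataset('groloch/stable_diffusion_prompts_instruct', split='train')
print(ds[0])
```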

### Training Hyperparameters

_Coming soon_

### Framework versions

- PEFT 0.13.2