---
license: mit
tags:
- unsloth
- opensource
- phi
---

<img src="https://cloud-3i4ld6u5y-hack-club-bot.vercel.app/0home.png" alt="Akash Network logo" width="200"/>

Thank you to the [Akash Network](https://akash.network/) for sponsoring this project and providing A100s/H100s for compute!
<a target="_blank" href="https://colab.research.google.com/github/andrewgcodes/autoprompter/blob/main/run_autoprompter.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
# Overview
Writing good AI art prompts for Midjourney, Stable Diffusion, and other image generators takes time and practice. I fine-tuned a small language model (Phi-3) to help you improve your prompts.

This is a fine-tuned version of the unquantized [unsloth/Phi-3-mini-4k-instruct](https://huggingface.co/unsloth/Phi-3-mini-4k-instruct), trained with Unsloth on ~100,000 high-quality Midjourney AI art prompts.

This Hugging Face repo contains the adapter weights only; you need to load them on top of the base Phi-3 weights.
<img src="https://cloud-5rbe0uczw-hack-club-bot.vercel.app/0screenshot_2024-05-20_at_3.30.18___pm.png" alt="prompt format" width="400"/>

# Inference
Recommended inference settings: `repetition_penalty=1.2` and `temperature` in the 0.5-1.0 range. Adjust these if the model starts repeating itself.
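
Continuing from the loading sketch above, a generation call with these settings might look like the following; the bare input string is illustrative, so wrap your idea in the prompt format shown in the screenshot above:

```python
# Generate an improved prompt with the recommended sampling settings.
inputs = tokenizer("a kitten playing in a garden", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,       # mirrors the 128-token max seq length used in training
    do_sample=True,
    temperature=0.7,          # anywhere in the recommended 0.5-1.0 range
    repetition_penalty=1.2,   # raise this if outputs start repeating
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```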

To run (Colab T4 GPU works):
<a target="_blank" href="https://colab.research.google.com/github/andrewgcodes/autoprompter/blob/main/run_autoprompter.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Fine-tuning Details
I used this [reference code](https://medium.com/@mauryaanoop3/fine-tuning-phi-3-with-unsloth-for-superior-performance-on-custom-data-2c14b3c1e90b) from Anoop Maurya.
After experimenting with various settings and parameters, this is what I settled on (see the code sketch after the list):
- Max seq length: 128 (few prompts are longer than 128 tokens; beyond that you probably get diminishing returns on image quality)
- Fine-tuned on unquantized base weights
- LoRA: R=32, Alpha=32
- Batch size: 32
- Epochs: 1
- Gradient accumulation steps: 4
- Warmup steps: 100
- Learning rate: 1e-4
- Optimizer: adamw_8bit
I used an H100 GPU from the Akash Network.
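
For reference, here's a sketch of how those settings map onto an Unsloth + TRL `SFTTrainer` setup, following the structure of the referenced article. This approximates the run from the settings above rather than reproducing the exact script, and the dataset text column name is an assumption:

```python
# Sketch of the fine-tuning configuration (not the exact training script).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Phi-3-mini-4k-instruct",
    max_seq_length=128,
    load_in_4bit=False,  # fine-tune on the unquantized base weights
)
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("gaodrew/midjourney-prompts-highquality", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumed column name; check the dataset card
    max_seq_length=128,
    args=TrainingArguments(
        per_device_train_batch_size=32,
        gradient_accumulation_steps=4,
        warmup_steps=100,
        num_train_epochs=1,
        learning_rate=1e-4,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
```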

# Dataset
Please see [gaodrew/midjourney-prompts-highquality](https://huggingface.co/datasets/gaodrew/midjourney-prompts-highquality)

# Limitations
The model is prone to rattling off long lists of adjectives. For example: "kitten, cute kitten with a big smile on its face and fluffy fur. The background is filled with colorful flowers in various shades of pink, purple, blue, yellow, orange, green, red, white, black, brown, gray, gold, silver, bronze, copper, brass, steel, aluminum, titanium, platinum, diamond". Raising `repetition_penalty` (see Inference above) helps rein this in.