SE6446 committed
Commit 01a8bc0
1 Parent(s): 5f6694f

Update README.md

Files changed (1)
  1. README.md +40 -3
README.md CHANGED
@@ -8,8 +8,45 @@ metrics:
  - wer
  pipeline_tag: image-to-text
  ---
- # Untitled7
-
- This model was loveingly named after the Colab notebook that made it.
-
- It is supposed to read images and extract a stable diffusion prompt from it but, it might not do a good job at it.
+ # Untitled7-colab_checkpoint
+
+ This model was lovingly named after the Google Colab notebook that made it. It is a finetune of Microsoft's [git-large-coco](https://huggingface.co/microsoft/git-large-coco) model on the 1k subset of [poloclub/diffusiondb](https://huggingface.co/datasets/poloclub/diffusiondb/viewer/2m_first_1k/train).
+
+ It is supposed to read an image and extract a Stable Diffusion prompt from it, but it might not do a good job at it; I wouldn't know, I haven't extensively tested it.
+
+ As the title suggests, this is a checkpoint: I originally intended to train on the entire dataset, but I'm unsure if I want to now...
+ ## Intended use
+
+ Fun!
+
+ ```python
+ # Load the processor and model directly
+ from transformers import AutoProcessor, AutoModelForCausalLM
+
+ processor = AutoProcessor.from_pretrained("SE6446/Untitled7-colab_checkpoint")
+ model = AutoModelForCausalLM.from_pretrained("SE6446/Untitled7-colab_checkpoint")
+
+ # Or use a pipeline as a high-level helper
+ from transformers import pipeline
+
+ pipe = pipeline("image-to-text", model="SE6446/Untitled7-colab_checkpoint")
+ ```
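+
+ A minimal inference sketch using the directly loaded model; the sample image URL and generation length here are illustrative assumptions, not from the original notebook:
+
+ ```python
+ import requests
+ from PIL import Image
+
+ # Any RGB image works; this COCO URL is just a placeholder.
+ url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ # Preprocess the image and generate a candidate prompt.
+ pixel_values = processor(images=image, return_tensors="pt").pixel_values
+ generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
+ print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
+ ```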
+
+ ## Out-of-scope use
+
+ Don't use this model to discriminate against, alienate, or in any other way harm/harass individuals. You guys know the drill...
+
+ ## Bias, Risks, and Limitations
+
+ This model does not produce accurate prompts; it is merely a bit of fun (and a waste of funds). However, it can suffer from biases present in the original git-large-coco model.
+
+ ## Training
+ *I.e., the boring stuff*
+
+ - lr = 5e-5
+ - epochs = 150
+ - optim = adamw
+ - fp16
+
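+ These settings would map onto the transformers Trainer roughly as follows; this is only a sketch, since the actual training loop from the notebook isn't published, and `output_dir` is a made-up name:
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Sketch of the reported hyperparameters as Trainer arguments.
+ # output_dir is a placeholder; the rest mirrors the list above.
+ training_args = TrainingArguments(
+     output_dir="untitled7-colab-checkpoint",  # hypothetical path
+     learning_rate=5e-5,
+     num_train_epochs=150,
+     optim="adamw_torch",
+     fp16=True,
+ )
+ ```
+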
+ If you want to finetune it further, you should freeze the embedding and vision transformer layers.
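+
+ A minimal sketch of that freezing step; the attribute names (`git.image_encoder`, `git.embeddings`) follow the transformers GIT implementation, so verify them against `model.named_parameters()` on your version:
+
+ ```python
+ # Freeze the vision transformer and the embeddings so only the
+ # text decoder layers keep training.
+ for param in model.git.image_encoder.parameters():
+     param.requires_grad = False
+ for param in model.git.embeddings.parameters():
+     param.requires_grad = False
+
+ trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
+ print(f"Trainable parameters: {trainable}")
+ ```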