---
license: apache-2.0
language:
- en
pipeline_tag: image-to-text
tags:
- mplug-owl
---

# Usage
## Get the latest codebase from GitHub
```bash
git clone https://github.com/X-PLUG/mPLUG-Owl.git
```
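
After cloning, the repository's Python dependencies need to be installed as well. A minimal sketch, assuming the repository ships a `requirements.txt` (check the repo itself for the exact setup steps):
```bash
cd mPLUG-Owl
pip install -r requirements.txt
```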

## Model initialization
```python
import torch

from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration
from mplug_owl.tokenization_mplug_owl import MplugOwlTokenizer
from mplug_owl.processing_mplug_owl import MplugOwlImageProcessor, MplugOwlProcessor

pretrained_ckpt = 'MAGAer13/mplug-owl-llama-7b'

# Load the weights in bfloat16 to reduce the memory footprint.
model = MplugOwlForConditionalGeneration.from_pretrained(
    pretrained_ckpt,
    torch_dtype=torch.bfloat16,
)
image_processor = MplugOwlImageProcessor.from_pretrained(pretrained_ckpt)
tokenizer = MplugOwlTokenizer.from_pretrained(pretrained_ckpt)
processor = MplugOwlProcessor(image_processor, tokenizer)
```
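
For faster inference, the model can be moved to a GPU after loading. This is plain PyTorch rather than anything mPLUG-Owl-specific, and assumes a CUDA device is available:
```python
# Optional: move the model to a GPU if one is available (standard PyTorch).
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)
model.eval()
```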

## Model inference
Prepare model inputs.
```python
# We use a human/AI template to organize the context as a multi-turn conversation.
# <image> denotes an image placeholder.
prompts = [
'''The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: <image>
Human: Explain why this meme is funny.
AI: ''']

# The image paths should be placed in image_list and kept in the same order as the
# <image> placeholders in the prompts. URLs, local file paths, and base64 strings are
# supported. You can customize the image pre-processing by modifying
# mplug_owl.modeling_mplug_owl.ImageProcessor.
image_list = ['https://xxx.com/image.jpg']
```
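
Because the prompt is an ordinary string, follow-up turns can be appended in the same Human/AI format. A hypothetical multi-turn prompt, where the question and the first answer are invented purely for illustration:
```python
# Hypothetical example: a second-round question continuing the conversation.
# Every <image> placeholder still needs a matching entry in image_list.
prompts = [
'''The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: <image>
Human: What is shown in the image?
AI: A cat sleeping on a laptop keyboard.
Human: Why might the cat have chosen that spot?
AI: ''']
```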

Get the response.
```python
import requests
from io import BytesIO
from PIL import Image

# Generation kwargs (the same ones transformers accepts) can be passed to model.generate().
generate_kwargs = {
    'do_sample': True,
    'top_k': 5,
    'max_length': 512
}

def load_image(source):
    # PIL cannot open URLs directly, so fetch remote images before decoding.
    if source.startswith(('http://', 'https://')):
        return Image.open(BytesIO(requests.get(source).content)).convert('RGB')
    return Image.open(source).convert('RGB')

images = [load_image(_) for _ in image_list]
inputs = processor(text=prompts, images=images, return_tensors='pt')
# Cast floating-point inputs to bfloat16 so they match the model weights.
inputs = {k: v.bfloat16() if v.dtype == torch.float else v for k, v in inputs.items()}
inputs = {k: v.to(model.device) for k, v in inputs.items()}
with torch.no_grad():
    res = model.generate(**inputs, **generate_kwargs)
sentence = tokenizer.decode(res.tolist()[0], skip_special_tokens=True)
print(sentence)
```
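
Since `generate_kwargs` are ordinary transformers generation arguments, the decoding strategy can be swapped out. A sketch of deterministic beam search instead of top-k sampling, with illustrative values:
```python
# Illustrative alternative: beam search instead of top-k sampling.
generate_kwargs = {
    'do_sample': False,
    'num_beams': 3,
    'max_length': 512
}
```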