vikhyatk committed on
Commit dffbe5f
1 Parent(s): b3d72d1

Update README.md

Files changed (1)
  1. README.md +32 -0
README.md CHANGED
@@ -1,3 +1,35 @@
  ---
  license: apache-2.0
  ---
+
+ moondream2 is a small vision language model designed to run efficiently on edge devices. Check out the [GitHub repository](https://github.com/vikhyat/moondream) for details.
+
+ **Benchmarks**
+
+ | Release | VQAv2 | GQA | TextVQA | POPE | TallyQA |
+ | --- | --- | --- | --- | --- | --- |
+ | **2024-03-04** (latest) | 74.2 | 58.5 | 36.4 | (coming soon) | (coming soon) |
+
+ **Usage**
+
+ ```bash
+ pip install transformers timm einops
+ ```
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from PIL import Image
+
+ model_id = "vikhyatk/moondream2"
+ model = AutoModelForCausalLM.from_pretrained(
+ model_id, trust_remote_code=True, revision="2024-03-04"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_id, revision="2024-03-04")
+
+ image = Image.open('<IMAGE_PATH>')
+ enc_image = model.encode_image(image)
+ print(model.answer_question(enc_image, "Describe this image.", tokenizer))
+ ```
+
+ The model is updated regularly, so we recommend pinning the model version to a
+ specific release as shown above.
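
The usage snippet added in this commit encodes the image once and then passes the encoding to `answer_question`. A minimal sketch of reusing that encoding for several questions could look like the following. `ask_many` is a hypothetical helper that is not part of this commit or of moondream2, and it assumes the value returned by `encode_image` can be passed to `answer_question` more than once, which the README does not state.

```python
from typing import Any, Dict, Iterable


def ask_many(
    model: Any, tokenizer: Any, image: Any, questions: Iterable[str]
) -> Dict[str, str]:
    """Hypothetical helper: ask several questions about a single image."""
    # Encode the image once, as the README snippet does.
    enc_image = model.encode_image(image)
    # Assumption: the encoded image can be reused across questions.
    return {
        question: model.answer_question(enc_image, question, tokenizer)
        for question in questions
    }
```

With `model`, `tokenizer`, and `image` loaded as in the README snippet (including the pinned `revision`), calling `ask_many(model, tokenizer, image, ["Describe this image."])` would return a dict that maps each question to the model's answer.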