Commit 19046ef by marianna13 (1 parent: 2d24a17)

Create README.md

  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ language:
+ - en
+ license: mit
+ library_name: transformers
+ datasets:
+ - liuhaotian/LLaVA-Instruct-150K
+ - liuhaotian/LLaVA-Pretrain
+ ---
+
+ # Model Card for LLaVa-Phi-2-3B-GGUF
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ Quantized GGUF version of [llava-phi-2-3b](https://huggingface.co/marianna13/llava-phi-2-3b). The quantization was done with [llama.cpp](https://github.com/ggerganov/llama.cpp/tree/master/examples/llava).
+
+ - **Developed by:** [LAION](https://laion.ai/), [SkunkworksAI](https://huggingface.co/SkunkworksAI) & [Ontocord](https://www.ontocord.ai/)
+ - **Model type:** LLaVA is an open-source chatbot trained by fine-tuning Phi-2 on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture.
+ - **Finetuned from model:** [Phi-2](https://huggingface.co/microsoft/phi-2)
+ - **License:** MIT
+
+ ### Model Sources
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [BakLLaVA](https://github.com/SkunkworksAI/BakLLaVA)
+ - **llama.cpp:** [GitHub](https://github.com/ggerganov/llama.cpp)
+
+ ## Usage
+
+ ```
+ make && ./llava-cli -m ../ggml-model-f16.gguf --mmproj ../mmproj-model-f16.gguf --image /path/to/image.jpg
+ ```
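
The usage command can also be wrapped in a small helper that checks the input files before calling `llava-cli`. This is only a minimal sketch, assuming `llava-cli` has already been built: the function name `run_llava` and the environment variables `MODEL`, `MMPROJ` and `LLAVA_CLI` are hypothetical names introduced for illustration, and the default prompt is just an example. The `-p` flag passes the text prompt.

```shell
#!/bin/sh
# Hypothetical wrapper around llava-cli (not part of llama.cpp).
# MODEL, MMPROJ and LLAVA_CLI are illustrative variable names;
# their defaults mirror the paths used in the usage command.

run_llava() {
    if [ "$#" -lt 1 ]; then
        echo "usage: run_llava IMAGE [PROMPT]" >&2
        return 2
    fi

    image=$1
    # Illustrative default prompt, used when no second argument is given.
    prompt=${2:-"Describe the image in detail."}
    model=${MODEL:-../ggml-model-f16.gguf}
    mmproj=${MMPROJ:-../mmproj-model-f16.gguf}
    cli=${LLAVA_CLI:-./llava-cli}

    # Fail early with a readable message if any input file is missing.
    for f in "$model" "$mmproj" "$image"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done

    # Same flags as the usage command, plus -p for the prompt.
    "$cli" -m "$model" --mmproj "$mmproj" --image "$image" -p "$prompt"
}
```

After sourcing the script, a call such as `run_llava /path/to/image.jpg "What is in this picture?"` would run the model on one image, and it returns a non-zero status without starting `llava-cli` if a file is missing.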
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Benchmarks
+
+ | Model | Parameters | SQA | GQA | TextVQA | POPE |
+ | --- | --- | --- | --- | --- | --- |
+ | [LLaVA-1.5](https://huggingface.co/liuhaotian/llava-v1.5-7b) | 7.3B | 68.0 | **62.0** | **58.3** | 85.3 |
+ | [MC-LLaVA-3B](https://huggingface.co/visheratin/MC-LLaVA-3b) | 3B | - | 49.6 | 38.59 | - |
+ | [LLaVA-Phi](https://arxiv.org/pdf/2401.02330.pdf) | 3B | 68.4 | - | 48.6 | 85.0 |
+ | [moondream1](https://huggingface.co/vikhyatk/moondream1) | 1.6B | - | 56.3 | 39.8 | - |
+ | **llava-phi-2-3b** | 3B | **69.0** | 51.2 | 47.0 | **86.0** |
+
+ ### Image Captioning (MS COCO)
+
+ | Model | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE_L | CIDEr | SPICE |
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+ | llava-1.5-7b | 75.8 | 59.8 | 45.0 | 33.3 | 29.4 | 57.7 | 108.8 | 23.5 |
+ | **llava-phi-2-3b** | 67.7 | 50.5 | 35.7 | 24.2 | 27.0 | 52.4 | 85.0 | 20.7 |