bczhou committed
Commit e5aaeb2
1 Parent(s): 507d5f6

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -19,7 +19,10 @@ We have evaluated TinyLLaVA on [GQA](https://cs.stanford.edu/people/dorarad/gqa/
 
 | Model | VQAv2 | GQA | SQA | TextVQA | VizWiz |
 | -------------------- | :------------: | :------------: | :------------: | :------------: | :------------: |
-| TinyLLaVA-v1-1.4B | 73.41 | 57.54 | 59.40 | 46.37 | 49.56 |
+| TinyLLaVA-v1-tinyllama | 73.41 | 57.54 | 59.40 | 46.37 | |
+| TinyLLaVA-v1-stablelm | 74.9 | 58.86 | 62.82 | 49.52 | 35.6 |
+| TinyLLaVA-v1.1-tinyllama | 75.24 | 59.43 | 58.80 | 48.05 | 34.74 |
+| TinyLLaVA-v1.1-stablelm | 76.34 | 60.26 | 63.06 | 51.6 | 36.34 |
 | BLIP-2 | 41.00 | 41.00 | 61.00 | 42.50 | 19.60 |
 | LLaVA-v1.5-7B | 78.50 | 62.00 | 66.80 | 61.3 | 50 |
 | LLaVA-v1.5-13B | 80.00 | 63.30 | 71.60 | 61.3 | 53.6 |