bczhou committed
Commit e5aaeb2
1 Parent(s): 507d5f6

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -19,7 +19,10 @@ We have evaluated TinyLLaVA on [GQA](https://cs.stanford.edu/people/dorarad/gqa/
 
 | Model | VQAv2 | GQA | SQA | TextVQA | VizWiz |
 | -------------------- | :------------: | :------------: | :------------: | :------------: | :------------: |
-| TinyLLaVA-v1-1.4B | 73.41 | 57.54 | 59.40 | 46.37 | 49.56 |
+| TinyLLaVA-v1-tinyllama | 73.41 | 57.54 | 59.40 | 46.37 | |
+| TinyLLaVA-v1-stablelm | 74.9 | 58.86 | 62.82 | 49.52 | 35.6 |
+| TinyLLaVA-v1.1-tinyllama | 75.24 | 59.43 | 58.80 | 48.05 | 34.74 |
+| TinyLLaVA-v1.1-stablelm | 76.34 | 60.26 | 63.06 | 51.6 | 36.34 |
 | BLIP-2 | 41.00 | 41.00 | 61.00 | 42.50 | 19.60 |
 | LLaVA-v1.5-7B | 78.50 | 62.00 | 66.80 | 61.3 | 50 |
 | LLaVA-v1.5-13B | 80.00 | 63.30 | 71.60 | 61.3 | 53.6 |