Text Generation · Transformers · PyTorch · English · llava · Inference Endpoints
Commit 16d48ca
SpursgoZmy committed
1 Parent(s): 78aa4f1

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ See the ACL 2024 paper for more details: [Multimodal Table Understanding](https:
 
  <!-- Provide a longer summary of what this model is. -->
 
- **Model Type:** Table LLaVA strictly follows the [LLaVA-v1.5](https://arxiv.org/abs/2310.03744) model architecture and training pipeline,
+ **Model Type:** Table LLaVA 13B strictly follows the [LLaVA-v1.5](https://arxiv.org/abs/2310.03744) model architecture and training pipeline,
  with [CLIP-ViT-L-336px](https://huggingface.co/openai/clip-vit-large-patch14-336) as visual encoder (336*336 image resolution),
  [Vicuna-v1.5-13B](https://huggingface.co/lmsys/vicuna-13b-v1.5) as base LLM and a two-layer MLP as vision-language connector.
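
For readers unfamiliar with the architecture named in the diff, below is a minimal PyTorch sketch of a LLaVA-v1.5-style two-layer MLP vision-language connector. It is not the repository's code: the class name is hypothetical, and the dimensions are assumptions based on the components cited above (1024-d patch features from CLIP-ViT-L/14-336px, 5120-d hidden size for Vicuna-v1.5-13B).

```python
import torch
import torch.nn as nn

class VisionLanguageConnector(nn.Module):
    """Sketch of a LLaVA-v1.5-style mlp2x_gelu projector (illustrative only).

    Assumed dims: 1024 (CLIP-ViT-L/14 patch features) -> 5120 (Vicuna-13B).
    """

    def __init__(self, vision_dim: int = 1024, llm_dim: int = 5120):
        super().__init__()
        # Two-layer MLP: Linear -> GELU -> Linear
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim). A 336*336 image
        # with 14x14 patches yields (336 / 14) ** 2 = 576 patches.
        return self.proj(patch_features)

# Usage: project one image's patch features into the LLM embedding space.
feats = torch.randn(1, 576, 1024)
tokens = VisionLanguageConnector()(feats)  # shape: (1, 576, 5120)
```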