What's the base language model of the model?
#1, opened by Labmem009
What is the base language model of this model? I ask because the weights are so small.
Hi, Labmem009
We constructed our llava model with the following three modules (you can confirm this in config.json):
- vision encoder: openai/clip-vit-large-patch14 (1.71 GB)
- vision language connector: 2-layer MLP
- language model: rinna/japanese-gpt-neox-small (663 MB)
The model.safetensors file in this repo contains the entire llava model above, so the total size should be 1.8 GB.
FYI, when training this llava model, we updated only the MLP and the LM, using the stair dataset.
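
If you want to check the size breakdown above yourself, here is a minimal sketch (not something from the repo) that reads the header of a .safetensors file and sums tensor bytes per name prefix. It relies only on the published safetensors layout: an 8-byte little-endian header length followed by a JSON header. The helper names `read_safetensors_header` and `bytes_per_prefix` are made up for illustration, and the actual tensor name prefixes in this repo's model.safetensors are not stated in this thread, so you may need to adjust `depth` to separate the three modules.

```python
import json
import struct
import sys
from collections import defaultdict


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def bytes_per_prefix(header, depth=1):
    """Sum tensor byte sizes, grouped by the first `depth` dot-separated name parts."""
    totals = defaultdict(int)
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        start, end = info["data_offsets"]
        prefix = ".".join(name.split(".")[:depth])
        totals[prefix] += end - start
    return dict(totals)


# Hypothetical usage: python sizes.py model.safetensors
if __name__ == "__main__" and len(sys.argv) > 1:
    header = read_safetensors_header(sys.argv[1])
    for prefix, size in sorted(bytes_per_prefix(header, depth=2).items()):
        print(f"{prefix}: {size / 1e6:.1f} MB")
```

Because only the header is read, this runs quickly even on multi-gigabyte files and does not need PyTorch or the safetensors package installed.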