What's the base language model of the model?

by Labmem009 - opened Jan 16, 2024

Discussion

Labmem009

Jan 16, 2024

What is the base language model of this model? I ask because the weights are so small.

atsumoto

Dialogue System Research Group at Nagoya University org Jan 16, 2024

Hi, Labmem009
We constructed our llava model with the following three modules (you can confirm this in config.json):

vision encoder: openai/clip-vit-large-patch14 (1.71 GB)
vision-language connector: 2-layer MLP
language model: rinna/japanese-gpt-neox-small (663 MB)
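
One way to check the three modules above against config.json is to filter its entries by key name. This is only a minimal sketch: the key hints (`vision`, `projector`, `name_or_path`) and the sample config in the usage note are assumptions for illustration, not the actual keys or values in this repo's config.json.

```python
import json


def find_module_entries(config, hints=("vision", "projector", "name_or_path")):
    """Return the config entries whose key contains one of the hint substrings.

    The hint substrings are guesses at how a llava-style config names its
    vision encoder, connector, and language model; adjust them as needed.
    """
    return {
        key: value
        for key, value in config.items()
        if any(hint in key.lower() for hint in hints)
    }


def load_config(path):
    """Load a config.json file from a local path."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

Usage, with a hypothetical config dict standing in for the real file: `find_module_entries({"example_vision_tower": "some/encoder", "hidden_size": 768})` keeps only the `example_vision_tower` entry. For the real file, pass `load_config("config.json")` after downloading it.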

The model.safetensors file in this repo contains the entire llava model described above, so the total size should be 1.8 GB.
FYI, when training this llava model, we updated only the MLP and the LM, using the STAIR dataset.
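
If you want to see how the size of model.safetensors splits across modules, a rough sketch is to read the file's JSON header and sum the stored bytes per tensor-name prefix. It relies only on the safetensors layout (an 8-byte little-endian header length followed by a JSON header with `data_offsets` per tensor). The function names are made up for illustration, and which prefixes show up depends on how the tensors in this repo are actually named, which is not stated here.

```python
import json
import struct
from collections import defaultdict


def read_safetensors_header(path):
    """Read only the JSON header of a .safetensors file (no tensor data)."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def bytes_per_prefix(header, depth=1):
    """Sum stored bytes per tensor-name prefix (first `depth` dotted parts)."""
    totals = defaultdict(int)
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata, not a tensor
            continue
        start, end = info["data_offsets"]
        prefix = ".".join(name.split(".")[:depth])
        totals[prefix] += end - start
    return dict(totals)
```

Usage: `bytes_per_prefix(read_safetensors_header("model.safetensors"), depth=2)` returns a dict of byte counts per prefix. Increase `depth` if the top-level prefixes are too coarse to separate the vision encoder, the MLP, and the LM.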
