About:
This is a LLaVA model that uses TinyLlama as its language model and openai/clip-vit-l-14-336 as its vision tower. The multimodal projection layers have not been trained yet.
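To give a rough sense of what the untrained projection layers would have to connect, here is a minimal pure-Python sketch of the shapes involved. Everything in it is an assumption rather than something read from this checkpoint: the hidden sizes (1024 for a CLIP ViT-L/14 vision tower, 2048 for a TinyLlama 1.1B language model) are taken from the commonly published configs of those base models, the two-layer MLP projector is the LLaVA-1.5 style design and may not match this model, and the function names are hypothetical.

```python
# Hypothetical shape check for a LLaVA-style model.
# All constants are assumptions, not values read from this checkpoint.
IMAGE_SIZE = 336      # input resolution implied by the "336" in the vision tower name
PATCH_SIZE = 14       # patch size implied by the "14" in the vision tower name
VISION_HIDDEN = 1024  # assumed width of a CLIP ViT-L/14 vision encoder
TEXT_HIDDEN = 2048    # assumed width of a TinyLlama 1.1B language model


def num_image_tokens(image_size: int, patch_size: int) -> int:
    """Number of patch embeddings the vision tower emits per image."""
    per_side = image_size // patch_size
    return per_side * per_side


def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameter count of one dense layer."""
    return in_features * out_features + (out_features if bias else 0)


def mlp_projector_params(vision_hidden: int, text_hidden: int) -> int:
    """Parameter count of an assumed two-layer MLP projector:
    Linear(vision_hidden, text_hidden) -> activation -> Linear(text_hidden, text_hidden).
    """
    return linear_params(vision_hidden, text_hidden) + linear_params(text_hidden, text_hidden)


if __name__ == "__main__":
    print("image tokens per image:", num_image_tokens(IMAGE_SIZE, PATCH_SIZE))
    print("projector parameters:", mlp_projector_params(VISION_HIDDEN, TEXT_HIDDEN))
```

Under these assumptions, the projector is a small fraction of the total model size, which is why LLaVA-style recipes often train it first while keeping the vision tower and language model frozen.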