---
license: llama2
pipeline_tag: text-generation
tags:
  - cortex.cpp
  - multimodal
  - vicuna
  - vision-language
---

## Overview

LLaVA (Large Language and Vision Assistant) is an open-source chatbot trained for multimodal instruction-following tasks. It is an auto-regressive, transformer-based language model fine-tuned from Vicuna-7B to process both text and image inputs, making it useful for research in computer vision, natural language processing, machine learning, and artificial intelligence.

LLaVA-v1.6-Vicuna-7B is the latest iteration, trained in December 2023, and optimized for improved instruction-following performance in multimodal settings.

## Variants

| No | Variant                    | Cortex CLI command                  |
|----|----------------------------|-------------------------------------|
| 1  | llava-v1.6-vicuna-7b-f16   | `cortex run llava-v1.6:gguf-f16`    |
| 2  | llava-v1.6-vicuna-7b-q4_km | `cortex run llava-v1.6:gguf-q4-km`  |
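
The f16 variant keeps the full 16-bit weights, while q4_km is a 4-bit quantization that trades a little output quality for a much smaller download and lower memory use. Below is a minimal sketch of fetching a specific variant before running it; it assumes `cortex pull` accepts the same `model:tag` form shown in the table, so verify against your Cortex version.

```bash
# Download the 4-bit variant ahead of time
# (assumption: `cortex pull` takes the same model:tag syntax as `cortex run`)
cortex pull llava-v1.6:gguf-q4-km

# Start an interactive session with that variant
cortex run llava-v1.6:gguf-q4-km
```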

## Use it with Jan (UI)

1. Install Jan using Quickstart
2. Use in Jan model Hub:
   ```text
   cortexso/llava-v1.6
   ```

## Use it with Cortex (CLI)

1. Install Cortex using Quickstart
2. Run the model with command:
   ```bash
   cortex run llava-v1.6
   ```
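
Once `cortex run` has loaded the model, it can also be queried programmatically. The sketch below is an assumption-laden example, not taken from this README: it assumes Cortex exposes its OpenAI-compatible API at the default `http://localhost:39281`, uses the OpenAI `image_url` content-part convention for the image, and points at a placeholder image URL. Adjust the host, port, and payload to your setup.

```bash
# Hedged sketch: endpoint, port, and multimodal payload shape are assumptions
# based on Cortex's OpenAI-compatible API; the image URL is a placeholder.
curl http://localhost:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llava-v1.6",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is shown in this image?"},
          {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}}
        ]
      }
    ]
  }'
```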
    

## Credits