aarticerebras committed on
Commit b26a998 · verified · 1 Parent(s): 374b680

Update README.md (#1)

- Update README.md (073294f0e228d9c479d58aee01cb3be547c45c8d)

Files changed (1)
  1. README.md +37 -0
README.md CHANGED
@@ -1,3 +1,40 @@
  ---
  license: apache-2.0
  ---
+ # Model Card for cerebras/Cerebras-LLaVA-7B
+
+ This checkpoint consists of the language model and projector weights of a multimodal LLaVA-7B model trained with our Cerebras implementation and training recipe.
+ The vision encoder checkpoints for this model can be found at [cerebras/Cerebras-ViT-L-336-patch14-llava7b-ShareGPT4V](https://huggingface.co/cerebras/Cerebras-ViT-L-336-patch14-llava7b-ShareGPT4V).
+
+ **Note**: _ShareGPT4V_ is added to the vision model name to ensure that checkpoints load correctly in the [LLaVA source repo](https://github.com/haotian-liu/LLaVA/blob/main/llava/model/multimodal_encoder/builder.py#L8).
+
+ For full details of this model and its training, please read our paper and release blog post, **to be released shortly**.
+
+ # Model Architecture
+ Cerebras-LLaVA-7B is a transformer model with the following architecture details:
+ * Vision encoder: [CLIP-VisionModel-Large](https://huggingface.co/cerebras/Cerebras-ViT-L-336-patch14-llava7b-ShareGPT4V). It takes images of size 336 x 336 with a patch size of 14.
+ * Large Language Model: Initialized from pretrained Vicuna-7B checkpoints and instruction-finetuned on various datasets.
+ * Projector: the module that connects the vision encoder to the LLM consists of two linear layers with a GELU activation (mlp2x-gelu).
+
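The sizes in the list above imply a fixed number of image tokens per image and a small projector. The following is a rough sketch of that arithmetic, not the Cerebras implementation. The hidden sizes (1024 for the CLIP ViT-L encoder, 4096 for Vicuna-7B) and the choice to count only patch tokens (no class token) are assumptions that this README does not state.

```python
# Back-of-the-envelope shapes for the architecture described above.
# Only IMAGE_SIZE and PATCH_SIZE come from this README; the hidden sizes
# below are assumptions about CLIP ViT-L and Vicuna-7B.

IMAGE_SIZE = 336      # from the README
PATCH_SIZE = 14       # from the README
VISION_HIDDEN = 1024  # assumption: CLIP ViT-L feature width
LLM_HIDDEN = 4096     # assumption: Vicuna-7B hidden width


def num_image_tokens(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    per_side = image_size // patch_size
    return per_side * per_side


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def projector_params(vision_hidden: int, llm_hidden: int) -> int:
    """Two linear layers; the GELU between them has no parameters."""
    first = linear_params(vision_hidden, llm_hidden)
    second = linear_params(llm_hidden, llm_hidden)
    return first + second


tokens = num_image_tokens(IMAGE_SIZE, PATCH_SIZE)
params = projector_params(VISION_HIDDEN, LLM_HIDDEN)
print(f"image tokens per image: {tokens}")
print(f"projector parameters:   {params:,}")
```

Under these assumptions, each image contributes 24 x 24 patch tokens to the LLM input, and the projector is small relative to the 7B-parameter language model.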
+ # Loading the model
+
+ This model can be loaded directly using the [LLaVA source code repository](https://github.com/haotian-liu/LLaVA). For installation, please refer to the [instructions in the source code repository](https://github.com/haotian-liu/LLaVA?tab=readme-ov-file#install).
+
+ ```python
+ from llava.model.builder import load_pretrained_model
+ from llava.mm_utils import get_model_name_from_path
+ from llava.eval.run_llava import eval_model
+
+ model_path = "cerebras/Cerebras-LLaVA-7B"
+
+ # Returns the tokenizer, the model, the image processor, and the context length.
+ tokenizer, model, image_processor, context_len = load_pretrained_model(
+     model_path=model_path,
+     model_base=None,
+     model_name=get_model_name_from_path(model_path),
+ )
+ ```
+
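The `eval_model` import in the snippet above is not used there. As a hedged sketch of how it might be called, the helper below builds an argument object with the fields that the quick-start example in the LLaVA repository passes to `eval_model`. The field names follow that example as we understand it and may differ between LLaVA versions. `build_eval_args`, `run_inference`, and the file name `example.jpg` are hypothetical names introduced for illustration.

```python
from types import SimpleNamespace


def build_eval_args(model_path, image_file, query, max_new_tokens=512):
    """Collect the settings that eval_model reads from its args object.

    Hypothetical helper; field names are assumed from the LLaVA quick-start.
    """
    return SimpleNamespace(
        model_path=model_path,
        model_base=None,
        query=query,
        conv_mode=None,         # let LLaVA infer the conversation template
        image_file=image_file,  # local path or URL
        sep=",",                # separator if several images are listed
        temperature=0,          # greedy decoding
        top_p=None,
        num_beams=1,
        max_new_tokens=max_new_tokens,
    )


def run_inference(image_file, query):
    """Run one image-plus-question query; requires LLaVA to be installed."""
    # Imported lazily so build_eval_args works without LLaVA installed.
    from llava.eval.run_llava import eval_model
    from llava.mm_utils import get_model_name_from_path

    args = build_eval_args("cerebras/Cerebras-LLaVA-7B", image_file, query)
    args.model_name = get_model_name_from_path(args.model_path)
    eval_model(args)  # prints the generated answer
```

With LLaVA installed, a call such as `run_inference("example.jpg", "What is shown in this image?")` would load the checkpoint and print the model's answer for that image.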
+ # Acknowledgements
+ We are thankful to all Cerebras engineers, past and present, who made this work possible.
+
+