mjbuehler committed on
Commit 16f8316
1 Parent(s): e59f67c

Update README.md

Files changed (1): README.md (+5 -4)
README.md CHANGED
@@ -36,17 +36,18 @@ The model is developed to process diverse inputs, including images and text, fac
 
 Cephalo provides a robust framework for multimodal interaction and understanding, including the development of complex generative pipelines to create 2D and 3D renderings of material microstructures as input for additive manufacturing methods.
 
-This version of Cephalo, lamm-mit/Cephalo-Phi-3-vision-128k-4b, is based on the Phi-3-Vision-128K-Instruct model. The model has a context length of 128,000 tokens. Further details, see: https://huggingface.co/microsoft/Phi-3-vision-128k-instruct.
+This version of Cephalo, lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha, is based on the Phi-3-Vision-128K-Instruct model. The model has a context length of 128,000 tokens. For further details, see: https://huggingface.co/microsoft/Phi-3-vision-128k-instruct.
 
 ### Chat Format
 
-Given the nature of the training data, the Cephalo-Phi-3-vision-128k-4b model is best suited for a single image input wih prompts using the chat format as follows.
+Given the nature of the training data, the Cephalo-Phi-3-vision-128k-4b-alpha model is best suited for a single image input with prompts using the chat format as follows.
+
 You can provide the prompt as a single image with a generic template as follows:
 ```markdown
 <|user|>\n<|image_1|>\n{prompt}<|end|>\n<|assistant|>\n
 ```
 
-where the model generates the text after `<|assistant|>` . For multi-turn conversations, the prompt should be formatted as follows:
+The model generates the text after `<|assistant|>`. For multi-turn conversations, the prompt should be formatted as follows:
 
 ```markdown
 <|user|>\n<|image_1|>\n{prompt_1}<|end|>\n<|assistant|>\n{response_1}<|end|>\n<|user|>\n{prompt_2}<|end|>\n<|assistant|>\n
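For concreteness, this is how the templates above can be assembled from Python. A minimal sketch; the question strings and the first response are illustrative placeholders, not part of the model card:

```python
# Single-image, single-turn prompt in the Phi-3-vision chat format.
# The model's answer is generated after the <|assistant|> marker.
question = "What does this microstructure imply for the material's properties?"  # placeholder
prompt = f"<|user|>\n<|image_1|>\n{question}<|end|>\n<|assistant|>\n"

# Multi-turn: append each completed exchange before the next user turn.
response_1 = "..."  # the model's first answer, placeholder
follow_up = "How could this be exploited in additive manufacturing?"  # placeholder
prompt_turn_2 = prompt + f"{response_1}<|end|>\n<|user|>\n{follow_up}<|end|>\n<|assistant|>\n"
```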
@@ -62,7 +63,7 @@ import requests
 from transformers import AutoModelForCausalLM
 from transformers import AutoProcessor
 
-model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b"
+model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha"
 
 model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto")
 