Update README.md
README.md (CHANGED)
@@ -36,17 +36,18 @@ The model is developed to process diverse inputs, including images and text, fac
 
 Cephalo provides a robust framework for multimodal interaction and understanding, including the development of complex generative pipelines to create 2D and 3D renderings of material microstructures as input for additive manufacturing methods.
 
-This version of Cephalo, lamm-mit/Cephalo-Phi-3-vision-128k-4b, is based on the Phi-3-Vision-128K-Instruct model. The model has a context length of 128,000 tokens. For further details, see: https://huggingface.co/microsoft/Phi-3-vision-128k-instruct.
+This version of Cephalo, lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha, is based on the Phi-3-Vision-128K-Instruct model. The model has a context length of 128,000 tokens. For further details, see: https://huggingface.co/microsoft/Phi-3-vision-128k-instruct.
 
 ### Chat Format
 
-Given the nature of the training data, the Cephalo-Phi-3-vision-128k-4b model is best suited for a single image input with prompts using the chat format as follows.
+Given the nature of the training data, the Cephalo-Phi-3-vision-128k-4b-alpha model is best suited for a single image input with prompts using the chat format as follows.
+
 You can provide the prompt with a single image using a generic template as follows:
 ```markdown
 <|user|>\n<|image_1|>\n{prompt}<|end|>\n<|assistant|>\n
 ```
 
-
+The model generates the text after `<|assistant|>`. For multi-turn conversations, the prompt should be formatted as follows:
 
 ```markdown
 <|user|>\n<|image_1|>\n{prompt_1}<|end|>\n<|assistant|>\n{response_1}<|end|>\n<|user|>\n{prompt_2}<|end|>\n<|assistant|>\n
 ```
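
As a sketch of how this multi-turn format can be produced programmatically rather than by hand, the example below uses the processor's tokenizer chat template. It assumes the repository ships the Phi-3-Vision chat template; the conversation content uses the same placeholders as the template above.

```python
from transformers import AutoProcessor

# Model id as shown in the diff above; trust_remote_code is needed for the Phi-3-Vision-style processor
model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Placeholder multi-turn conversation; <|image_1|> refers to the single image passed at inference time
messages = [
    {"role": "user", "content": "<|image_1|>\n{prompt_1}"},
    {"role": "assistant", "content": "{response_1}"},
    {"role": "user", "content": "{prompt_2}"},
]

# Renders the <|user|> ... <|end|> ... <|assistant|> string shown in the template above
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```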
@@ -62,7 +63,7 @@ import requests
 from transformers import AutoModelForCausalLM
 from transformers import AutoProcessor
 
-model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b"
+model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha"
 
 model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto")
 
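
To round out the loading snippet above, here is a minimal end-to-end inference sketch: it loads the model and processor, formats a single-image prompt in the chat format, and decodes only the newly generated tokens. The image URL, question, and generation settings are illustrative assumptions, following the usage pattern of the underlying Phi-3-Vision-128K-Instruct model.

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha"

model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto"
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Placeholder image of a material microstructure; any PIL image works here
url = "https://example.com/microstructure.png"
image = Image.open(requests.get(url, stream=True).raw)

# Single-image, single-turn prompt in the chat format described above
prompt = "<|user|>\n<|image_1|>\nDescribe the microstructure shown in this image.<|end|>\n<|assistant|>\n"

inputs = processor(prompt, [image], return_tensors="pt").to("cuda")

generate_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    eos_token_id=processor.tokenizer.eos_token_id,
)

# Drop the prompt tokens so that only the assistant's reply is decoded
generate_ids = generate_ids[:, inputs["input_ids"].shape[1]:]
response = processor.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
print(response)
```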