JosefAlbers committed on
Commit 0d1c555
1 parent: 00619d6

Update README.md

Files changed (1)
  1. README.md +26 -3
README.md CHANGED
@@ -9,7 +9,8 @@ tags:
 - llm
 - phi
 ---
-# Phi-3-Vision VLM Model for Apple MLX: An All-in-One Port
+
+# Phi-3-Vision for Apple MLX
 
 This project brings the powerful phi-3-vision VLM to Apple's MLX framework, offering a comprehensive solution for various text and image processing tasks. With a focus on simplicity and efficiency, this implementation offers a straightforward and minimalistic integration of the VLM model. It seamlessly incorporates essential functionalities such as generating quantized model weights, optimizing KV cache quantization during inference, facilitating LoRA/QLoRA training, and conducting model benchmarking, all encapsulated within a single file for convenient access and usage.
 
@@ -27,6 +28,28 @@ This project brings the powerful phi-3-vision VLM to Apple's MLX framework, offe
 
 ## Quick Start
 
+**1. Install Phi-3 Vision MLX:**
+
+```bash
+git clone https://github.com/JosefAlbers/Phi-3-Vision-MLX.git
+```
+
+**2. Launch Phi-3 Vision MLX:**
+
+```bash
+phi3v
+```
+
+Or,
+
+```python
+from phi_3_vision_mlx import chatui
+
+chatui()
+```
+
+## Usage
+
 ### **VLM Agent** (WIP)
 
 VLM's understanding of both text and visuals enables interactive generation and modification of plots/images, opening up new possibilities for GUI development and data visualization.
@@ -189,7 +212,7 @@ Generation: 8.56 tokens-per-sec (100 tokens / 11.6 sec)
 ### **LoRA Testing** (WIP)
 
 ```python
-# from phi_3_vision_mlx import recall
+# from phi_3_vision_mlx import test_lora
 
 test_lora(dataset_path="JosefAlbers/akemiH_MedQA_Reason"):
 ```
@@ -321,4 +344,4 @@ This project is licensed under the [MIT License](LICENSE).
 
 ## Citation
 
-<a href="https://zenodo.org/doi/10.5281/zenodo.11403221"><img src="https://zenodo.org/badge/806709541.svg" alt="DOI"></a>
+<a href="https://zenodo.org/doi/10.5281/zenodo.11403221"><img src="https://zenodo.org/badge/806709541.svg" alt="DOI"></a>