YenChunChen committed on
Commit ff0d44e
1 Parent(s): bfc0c05

update readme regarding FA2

Files changed (1)
  1. README.md +1 -19
README.md CHANGED
@@ -105,7 +105,7 @@ from transformers import AutoProcessor
 
 model_id = "microsoft/Phi-3-vision-128k-instruct"
 
- model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto")
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto", attn_implementation='eager') # use attn_implementation='flash_attention_2' to enable flash attention
 
 processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
 
@@ -217,24 +217,6 @@ Note that by default, the Phi-3-Vision-128K model uses flash attention, which re
 * NVIDIA A6000
 * NVIDIA H100
 
- ### Running on Windows or without flash attention
- To enable the model on these enviroment here are steps that you may consider to follow:
-
- Step 1: comment flash attention import code in modeling_phi3_v.py from line 52 to line 56.
- ```python
- # if is_flash_attn_2_available():
- #     from flash_attn import flash_attn_func, flash_attn_varlen_func
- #     from flash_attn.bert_padding import index_first_axis, pad_input, unpad_input  # noqa
-
- #     _flash_supports_window_size = "window_size" in list(inspect.signature(flash_attn_func).parameters)
- ```
-
- Step 2: change _"_attn_implementation"_ from _"flash_attention_2"_ to _"eager"_ in config.json or disable flash attention when you create the model as below.
-
- ```python
- model = AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-vision-128k-instruct', device_map="cuda", trust_remote_code=True, torch_dtype="auto", _attn_implementation="eager")
- ```
-
 ## License
 
 The model is licensed under the [MIT license](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/resolve/main/LICENSE).
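
For context, here is a minimal sketch (not part of the commit) of how the updated snippet can choose the attention backend at load time. The `is_flash_attn_2_available` check and the fallback to `"eager"` are illustrative assumptions rather than text from the README.

```python
# Minimal sketch: load Phi-3-Vision and fall back to eager attention when
# flash-attn is not installed (e.g. on Windows or on unsupported GPUs).
# The availability check and fallback are assumptions, not part of the commit.
from transformers import AutoModelForCausalLM, AutoProcessor
from transformers.utils import is_flash_attn_2_available

model_id = "microsoft/Phi-3-vision-128k-instruct"

# Use flash attention 2 only when the package is importable; otherwise use eager.
attn_impl = "flash_attention_2" if is_flash_attn_2_available() else "eager"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    trust_remote_code=True,
    torch_dtype="auto",
    attn_implementation=attn_impl,
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
```

This mirrors the change in the hunk above: the README snippet now defaults to `'eager'`, with flash attention opted in through the same keyword instead of by editing modeling_phi3_v.py or config.json.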