wwwaj committed
Commit 1e56361
1 Parent(s): edb43c8

Update README.md

Files changed (1):
  1. README.md +0 -1
README.md CHANGED
@@ -216,7 +216,6 @@ Note that by default, the Phi-3-mini model uses flash attention, which requires
 
 If you want to run the model on:
 * NVIDIA V100 or earlier generation GPUs: call AutoModelForCausalLM.from_pretrained() with attn_implementation="eager"
-* CPU: use the GGUF quantized models 4K
 * Optimized inference on GPU, CPU, and Mobile: use the **ONNX** models [128K](https://aka.ms/phi3-mini-128k-instruct-onnx)
 
 ## Cross Platform Support
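For context, the unchanged README line above tells V100 users to pass attn_implementation="eager" to AutoModelForCausalLM.from_pretrained(). A minimal sketch of that call is shown below; the model ID, dtype, prompt, and generation parameters are illustrative assumptions, not part of this commit.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID for illustration; this commit only edits a Phi-3-mini README.
model_id = "microsoft/Phi-3-mini-128k-instruct"

# V100 and earlier GPUs lack flash-attention support, so the README
# advises requesting the eager attention implementation instead.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,    # assumption: fp16 to reduce GPU memory use
    attn_implementation="eager",  # fallback for pre-Ampere GPUs such as V100
    device_map="auto",            # requires the `accelerate` package
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Illustrative generation call to confirm the model loads and runs.
inputs = tokenizer("What is flash attention?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```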
 