legraphista committed on
Commit 3048fc1
1 Parent(s): 113fd1c

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +28 -3
README.md CHANGED
@@ -22,6 +22,20 @@ Original dtype: `BF16` (`bfloat16`)
 Quantized by: llama.cpp [https://github.com/ggerganov/llama.cpp/pull/7519](https://github.com/ggerganov/llama.cpp/pull/7519)
 IMatrix dataset: [here](https://gist.githubusercontent.com/legraphista/d6d93f1a254bcfc58e0af3777eaec41e/raw/d380e7002cea4a51c33fffd47db851942754e7cc/imatrix.calibration.medium.raw)
 
+ - [DeepSeek-V2-Lite-IMat-GGUF](#deepseek-v2-lite-imat-gguf)
+     - [Files](#files)
+         - [IMatrix](#imatrix)
+         - [Common Quants](#common-quants)
+         - [All Quants](#all-quants)
+     - [Downloading using huggingface-cli](#downloading-using-huggingface-cli)
+     - [Inference](#inference)
+         - [Llama.cpp](#llama-cpp)
+     - [FAQ](#faq)
+         - [Why is the IMatrix not applied everywhere?](#why-is-the-imatrix-not-applied-everywhere)
+         - [How do I merge a split GGUF?](#how-do-i-merge-a-split-gguf)
+
+ ---
+
 ## Files
 
 ### IMatrix
@@ -64,20 +78,31 @@ Link: [here](https://huggingface.co/legraphista/DeepSeek-V2-Lite-IMat-GGUF/blob/
 
 
 ## Downloading using huggingface-cli
- First, make sure you have huggingface-cli installed:
+ If you do not have huggingface-cli installed:
 ```
 pip install -U "huggingface_hub[cli]"
 ```
- Then, you can target the specific file you want:
+ Download the specific file you want:
 ```
 huggingface-cli download legraphista/DeepSeek-V2-Lite-IMat-GGUF --include "DeepSeek-V2-Lite.Q8_0.gguf" --local-dir ./
 ```
- If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
+ If the model is larger than 50GB, it has been split into multiple files. To download them all to a local folder, run:
 ```
 huggingface-cli download legraphista/DeepSeek-V2-Lite-IMat-GGUF --include "DeepSeek-V2-Lite.Q8_0/*" --local-dir DeepSeek-V2-Lite.Q8_0
 # see FAQ for merging GGUFs
 ```
 
+ ---
+
+ ## Inference
+
+ ### Llama.cpp
+ ```
+ llama.cpp/main -m DeepSeek-V2-Lite.Q8_0.gguf --color -i -p "prompt here"
+ ```
+
+ ---
+
 ## FAQ
 
 ### Why is the IMatrix not applied everywhere?
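A note on the `# see FAQ for merging GGUFs` comment in the split-download step above: llama.cpp ships a `gguf-split` tool whose `--merge` mode rejoins the parts into a single file. A minimal sketch, assuming a llama.cpp build that includes this tool (depending on the build it may be installed as `gguf-split` or `llama-gguf-split`); `<first-split>.gguf` is a placeholder for the actual first part inside the downloaded folder:

```
# Sketch: merge a split GGUF back into one file.
# <first-split>.gguf is a placeholder for the first downloaded part;
# gguf-split locates the remaining parts from it automatically.
./gguf-split --merge DeepSeek-V2-Lite.Q8_0/<first-split>.gguf DeepSeek-V2-Lite.Q8_0.gguf
```

Merging is often unnecessary: recent llama.cpp builds can load a split model directly when `-m` points at the first part.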
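On the `Llama.cpp` inference command added in this commit: `llama.cpp/main` is the older name of the example binary; newer llama.cpp builds install it as `llama-cli`. A minimal sketch under that assumption, reusing the flags from the README:

```
# Same interactive invocation, assuming a build where the binary is named llama-cli.
./llama-cli -m DeepSeek-V2-Lite.Q8_0.gguf --color -i -p "prompt here"
```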