hierholzer committed
Commit
8ac9b95
1 Parent(s): e899c1b

Create README.md

Files changed (1): README.md ADDED (+56 -0)
---
license: apache-2.0
language:
- en
---

# Model

This is a quantized version of Llama-3.1-70B-Instruct in GGUF format.

GGUF is designed for use with GGML and other executors.
GGUF was developed by @ggerganov, who is also the developer of llama.cpp, a popular C/C++ LLM inference framework.
Models initially developed in frameworks like PyTorch can be converted to GGUF format for use with these engines.
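
For example, a GGUF-aware executor can load this model's .gguf file directly. Below is a minimal sketch using the llama-cpp-python bindings for llama.cpp; the file name and generation parameters are illustrative assumptions, not values taken from this repository.

```python
# Minimal sketch: running a GGUF model with llama-cpp-python
# (pip install llama-cpp-python). The model_path below is a
# hypothetical example; point it at the .gguf file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-70B-Instruct-Q5_K_M.gguf",  # hypothetical file name
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize GGUF in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```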


## Uploaded Quantization Types

Currently, I have uploaded 2 quantized versions:

- Q5_K_M : large, very low quality loss
- Q8_0 : very large, extremely low quality loss
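
If you want to pull one of these files programmatically, a sketch along these lines should work; the repo_id and filename are assumptions for illustration, so check this repository's "Files" tab for the exact names (large models are sometimes split into multiple parts).

```python
# Sketch: downloading a specific quantization from the Hugging Face Hub.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="hierholzer/Llama-3.1-70B-Instruct-GGUF",  # hypothetical repo id
    filename="Llama-3.1-70B-Instruct-Q5_K_M.gguf",     # hypothetical file name
)
print(path)  # local path to the downloaded GGUF file
```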

### All Quantization Types Possible

Here are all of the quantization types that are possible (the numbers are the IDs used by llama.cpp's llama-quantize tool). Let me know if you need any other versions.

| ID | Type   | Notes                                                         |
|----|--------|---------------------------------------------------------------|
| 2  | Q4_0   | small, very high quality loss - legacy, prefer using Q3_K_M   |
| 3  | Q4_1   | small, substantial quality loss - legacy, prefer using Q3_K_L |
| 8  | Q5_0   | medium, balanced quality - legacy, prefer using Q4_K_M        |
| 9  | Q5_1   | medium, low quality loss - legacy, prefer using Q5_K_M        |
| 10 | Q2_K   | smallest, extreme quality loss - not recommended              |
| 12 | Q3_K   | alias for Q3_K_M                                              |
| 11 | Q3_K_S | very small, very high quality loss                            |
| 12 | Q3_K_M | very small, very high quality loss                            |
| 13 | Q3_K_L | small, substantial quality loss                               |
| 15 | Q4_K   | alias for Q4_K_M                                              |
| 14 | Q4_K_S | small, significant quality loss                               |
| 15 | Q4_K_M | medium, balanced quality - *recommended*                      |
| 17 | Q5_K   | alias for Q5_K_M                                              |
| 16 | Q5_K_S | large, low quality loss - *recommended*                       |
| 17 | Q5_K_M | large, very low quality loss - *recommended*                  |
| 18 | Q6_K   | very large, extremely low quality loss                        |
| 7  | Q8_0   | very large, extremely low quality loss - not recommended      |
| 1  | F16    | extremely large, virtually no quality loss - not recommended  |
| 0  | F32    | absolutely huge, lossless - not recommended                   |
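
If you need a version that is not uploaded here, these files are typically produced with llama.cpp's llama-quantize tool, run against an F16 (or F32) GGUF conversion of the original model. A minimal sketch, assuming llama-quantize is built and on your PATH and that the hypothetical input file exists:

```python
# Sketch: invoking llama.cpp's llama-quantize to produce a Q5_K_M file.
# File names are hypothetical; the last argument is the type name
# (or its numeric ID, e.g. 17) from the table above.
import subprocess

subprocess.run(
    [
        "llama-quantize",
        "Llama-3.1-70B-Instruct-F16.gguf",     # hypothetical input GGUF
        "Llama-3.1-70B-Instruct-Q5_K_M.gguf",  # hypothetical output GGUF
        "Q5_K_M",
    ],
    check=True,
)
```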


## Uses

By using the GGUF version of Llama-3.1-70B-Instruct, you can run this LLM with significantly fewer resources than the non-quantized version requires.
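
As a rough back-of-the-envelope check (the bits-per-weight figures are approximations, and real file sizes also include metadata and vary with the exact quantization layout):

```python
# Approximate memory needed just for the weights of a 70B-parameter model.
params = 70e9  # parameter count

for name, bits_per_weight in [("F16", 16.0), ("Q8_0", 8.5), ("Q5_K_M", 5.5)]:
    gigabytes = params * bits_per_weight / 8 / 1e9
    print(f"{name}: ~{gigabytes:.0f} GB")
```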