PacmanIncarnate committed
Commit 772fba3
1 Parent(s): 669e242

Update README.md

Files changed (1)
  1. README.md +25 -5
README.md CHANGED
@@ -1,8 +1,10 @@
- <img src="Faraday Model Repository Banner.png" alt="Faraday.dev" style="width: 20%; min-width: 32px; display: block; horizontal align: left;">
- # Faraday.dev Model Repository
- Conveniently download this model from the Faraday.dev app model manager.
- - [Download Faraday here to get started.](https://faraday.dev/)
- - Request Additional GGUF models at [r/LLM_Quants](https://www.reddit.com/r/LLM_Quants/s/iizaX3acGa)
+ <img src="Faraday Model Repository Banner.png" alt="Faraday.dev" style="height: 150px; min-width: 32px; display: block; margin: auto;">
+
+ **<p style="text-align: center;">The official library of GGUF format models for use in the local AI chat app, Faraday.dev.</p>**
+
+ <p style="text-align: center;"><a href="https://faraday.dev/">Download Faraday here to get started.</a></p>
+
+ <p style="text-align: center;"><a href="https://www.reddit.com/r/LLM_Quants/">Request Additional models at r/LLM_Quants.</a></p>
 
 
  ***
 
@@ -20,6 +22,24 @@ GGUF is a large language model (LLM) format that can be split between CPU and GPU
 
  GGUF models are quantized to reduce resource usage, with a tradeoff of reduced coherence at lower quantizations. Quantization reduces the precision of the model weights by changing the number of bits used for each weight.
 
+ ### 7B Quantization Chart
+ *Memory required must be less than your available RAM.*
+
+ | Quant method | Size | Memory required at 4K Context |
+ | --- | --- | --- |
+ | Q2_K | 2.72 GB | 5.22 GB |
+ | Q3_K_S | 3.16 GB | 5.66 GB |
+ | Q3_K_M | 3.52 GB | 6.02 GB |
+ | Q3_K_L | 3.82 GB | 6.32 GB |
+ | Q4_0 | 4.11 GB | 6.61 GB |
+ | Q4_K_S | 4.14 GB | 6.64 GB |
+ | Q4_K_M | 4.37 GB | 6.87 GB |
+ | Q5_0 | 5.00 GB | 7.50 GB |
+ | Q5_K_S | 5.00 GB | 7.50 GB |
+ | Q5_K_M | 5.13 GB | 7.63 GB |
+ | Q6_K | 5.94 GB | 8.44 GB |
+ | Q8_0 | 7.70 GB | 10.20 GB |
+
  ***
 
  ## Faraday.dev
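The quantization chart added in this commit implies a simple rule of thumb: every row's "Memory required at 4K Context" is the file size plus a constant ~2.5 GB of context overhead. A minimal sketch of that estimate, assuming the constant-overhead reading of the chart (the 2.5 GB figure is derived from the table, not from any llama.cpp or Faraday.dev documentation):

```python
# Estimate RAM needed for a 7B GGUF quant at 4K context.
# Assumption: the chart implies a roughly constant ~2.5 GB overhead
# on top of the model file size (e.g. Q2_K: 2.72 + 2.50 = 5.22 GB).
CONTEXT_OVERHEAD_GB = 2.5  # at 4K context, per the chart

# File sizes taken directly from the chart's "Size" column.
QUANT_SIZES_GB = {
    "Q2_K": 2.72, "Q3_K_S": 3.16, "Q3_K_M": 3.52, "Q3_K_L": 3.82,
    "Q4_0": 4.11, "Q4_K_S": 4.14, "Q4_K_M": 4.37, "Q5_0": 5.00,
    "Q5_K_S": 5.00, "Q5_K_M": 5.13, "Q6_K": 5.94, "Q8_0": 7.70,
}

def memory_required_gb(quant: str) -> float:
    """File size plus context overhead, in GB."""
    return round(QUANT_SIZES_GB[quant] + CONTEXT_OVERHEAD_GB, 2)

def fits_in_ram(quant: str, available_ram_gb: float) -> bool:
    """Memory required must be less than your available RAM."""
    return memory_required_gb(quant) < available_ram_gb

print(memory_required_gb("Q4_K_M"))  # 6.87, matching the chart
print(fits_in_ram("Q8_0", 8.0))      # False: Q8_0 needs 10.20 GB
```

In practice the overhead grows with context length, so treat this as a floor for 4K context rather than a general formula.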