Update README.md
README.md
CHANGED
@@ -65,6 +65,10 @@ Good quants for reading (prompt eval speed) are BF16, F16, Q4\_0, and
 Q8\_0 (ordered from fastest to slowest). Prompt evaluation is bounded by
 computation speed (flops) so simpler quants help.
 
+Files which exceed the HF 50GB upload limit have a .cat𝑋 extension. You
+need to use the `cat` command locally to turn them back into a single
+file, using the same order.
+
 Note: BF16 is currently only supported on CPU.
 
 ## Hardware Choices (LLaMA3 70B Specific)
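
The reassembly step described in the added lines can be sketched roughly as below. This is a minimal illustration, not the author's exact procedure: the file names (`model.bin.cat0`, `model.bin.cat1`) are hypothetical stand-ins, and it assumes the suffix after `.cat` indicates the order of the parts. The sketch uses two tiny placeholder files so it can be run safely anywhere.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder parts standing in for the downloaded pieces.
# The names are hypothetical; use the actual .cat* files from the repository.
printf 'first-half-' > model.bin.cat0
printf 'second-half' > model.bin.cat1

# Concatenate the parts in order into a single file.
# Listing the parts explicitly avoids relying on glob sort order.
cat model.bin.cat0 model.bin.cat1 > model.bin

joined=$(cat model.bin)
echo "$joined"
```

With real files, only the `cat ... > ...` line is needed. Once the combined file has been checked, the individual parts can be deleted to reclaim disk space.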