MarsupialAI committed
Commit 0e51047 · Parent(s): 1f787c8
Update README.md
README.md CHANGED
@@ -10,11 +10,13 @@ So to test this crazy theory, I downloaded Undi95/Meta-Llama-3-8B-Instruct-hf an
 - "Auto" with no outtype specified
 
 I then quantized each of these conversions to Q4_K_M and ran perplexity tests on everything using my abbreviated wiki.short.raw
-text file
-
-The results:
-
+text file. The results:
 
+````
+FP16 specified: size 14.9GB   PPL @ fp16 9.5158 +/- 0.15418   PPL @ Q4km 9.6414 +/- 0.15494
+FP32 specified: size 29.9GB   PPL @ fp32 9.5158 +/- 0.15418   PPL @ Q4km 9.6278 +/- 0.15466
+None specified: size 29.9GB   PPL @ ???? 9.5158 +/- 0.15418   PPL @ Q4km 9.6278 +/- 0.15466
+````
 
 
 As you can see, converting to fp32 has no meaningful effect on PPL compared to converting to fp16. There will no doubt be some
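For anyone wanting to reproduce the test, the conversion, quantization, and perplexity steps described in the README map onto the standard llama.cpp tools roughly as follows. This is a minimal sketch, not the exact commands behind the numbers above: the model directory, output file names, and the wiki.short.raw path are placeholders, and the tool names (convert-hf-to-gguf.py, quantize, perplexity) are the llama.cpp names from around this period.

````
# Convert the HF checkpoint to GGUF three ways: fp16, fp32, and "auto" (no outtype given)
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outtype f16 --outfile llama3-8b-f16.gguf
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outtype f32 --outfile llama3-8b-f32.gguf
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf               --outfile llama3-8b-auto.gguf

# Quantize each conversion to Q4_K_M
./quantize llama3-8b-f16.gguf  llama3-8b-f16-Q4_K_M.gguf  Q4_K_M
./quantize llama3-8b-f32.gguf  llama3-8b-f32-Q4_K_M.gguf  Q4_K_M
./quantize llama3-8b-auto.gguf llama3-8b-auto-Q4_K_M.gguf Q4_K_M

# Perplexity of each file (full precision and Q4_K_M) against the abbreviated wiki text
./perplexity -m llama3-8b-f16.gguf        -f wiki.short.raw
./perplexity -m llama3-8b-f16-Q4_K_M.gguf -f wiki.short.raw
# ...and likewise for the fp32 and "auto" conversions
````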