MarsupialAI committed
Commit 1f787c8
Parent(s): 21d3cb1
Update README.md
README.md CHANGED
@@ -9,13 +9,15 @@ So to test this crazy theory, I downloaded Undi95/Meta-Llama-3-8B-Instruct-hf an
 - fp32 specifically with `--outtype f32`
 - "Auto" with no outtype specified
 
-I then quantized each of these conversions to Q4_K_M and ran perplexity tests on everything using my abbreviated wiki.short.raw
+I then quantized each of these conversions to Q4_K_M and ran perplexity tests on everything using my abbreviated wiki.short.raw
+text file
 
 The results:
 
 [results table not captured in this diff view]
 
-As you can see, converting to fp32 has no meaningful effect on PPL. There will no doubt be some
-"PpL iSn'T gOoD eNoUgH!!1!". For those people, I have uploaded all GGUFs used in this test. Feel free to
-testing on your own time. I consider the matter resolved until somebody can conclusively
+As you can see, converting to fp32 has no meaningful effect on PPL compared to converting to fp16. There will no doubt be some
+people who will claim "PpL iSn'T gOoD eNoUgH!!1!". For those people, I have uploaded all GGUFs used in this test. Feel free to
+use those files to do more extensive testing on your own time. I consider the matter resolved until somebody can conclusively
+demonstrate otherwise.
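
The convert / quantize / perplexity procedure described in the README maps onto the standard llama.cpp tools. The sketch below is an assumption about how such a test could be run, not something taken from this commit: the output file names are illustrative, and the quantize binary is called `llama-quantize` in newer llama.cpp builds.

```sh
# Hypothetical reproduction of the comparison described above (file names are made up).

# 1. Convert the HF checkpoint to GGUF three ways: fp16, fp32, and "auto" (no --outtype)
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outtype f16 --outfile llama3-8b-instruct-f16.gguf
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outtype f32 --outfile llama3-8b-instruct-f32.gguf
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outfile llama3-8b-instruct-auto.gguf

# 2. Quantize each conversion to Q4_K_M
./quantize llama3-8b-instruct-f16.gguf  llama3-8b-instruct-f16-Q4_K_M.gguf  Q4_K_M
./quantize llama3-8b-instruct-f32.gguf  llama3-8b-instruct-f32-Q4_K_M.gguf  Q4_K_M
./quantize llama3-8b-instruct-auto.gguf llama3-8b-instruct-auto-Q4_K_M.gguf Q4_K_M

# 3. Run the same perplexity measurement against every GGUF with the same text file
./perplexity -m llama3-8b-instruct-f16-Q4_K_M.gguf  -f wiki.short.raw
./perplexity -m llama3-8b-instruct-f32-Q4_K_M.gguf  -f wiki.short.raw
./perplexity -m llama3-8b-instruct-auto-Q4_K_M.gguf -f wiki.short.raw
```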