nisten/qwenv2-7b-inst-imatrix-gguf

1 contributor

History: 11 commits

Probably best speed to perplexity ratio of any 7b gguf model so far

0e76852 verified about 2 months ago

| File | Size | Storage | Last commit message | Updated |
| --- | --- | --- | --- | --- |
| .gitattributes | 2.5 kB | | Probably best speed to perplexity ratio of any 7b gguf model so far | about 2 months ago |
| 8bitimatrix.dat | 4.54 MB | LFS | calculated imatrix in 8bit, was just as good as f16 imatrix | about 2 months ago |
| README.md | 1.55 kB | | Update README.md | about 2 months ago |
| qwen7bf16.gguf | 15.2 GB | LFS | Upload 9 files | about 2 months ago |
| qwen7bq4kembeddingf16outputf16.gguf | 6.11 GB | LFS | Rename qwen7bq4kembeddingbf16outputbf16.gguf to qwen7bq4kembeddingf16outputf16.gguf | about 2 months ago |
| qwen7bq4koutput8bit.gguf | 4.82 GB | LFS | Upload 9 files | about 2 months ago |
| qwen7bq4xsembedding8output8.gguf | 4.64 GB | LFS | Rename qwen7bq4xsembedding5bitkoutput8bit.gguf to qwen7bq4xsembedding8output8.gguf | about 2 months ago |
| qwen7bq4xsoutput6k.gguf | 4.22 GB | LFS | Rename qwen7bq4xs.gguf to qwen7bq4xsoutput6k.gguf | about 2 months ago |
| qwen7bv2_iq4xs_output8bit.gguf | 4.35 GB | LFS | Probably best speed to perplexity ratio of any 7b gguf model so far | about 2 months ago |
| qwen7bv2instruct_q5km.gguf | 5.58 GB | LFS | standard q5km conversions with 8bit output for reference. | about 2 months ago |
| qwenv2instruct7b_q8.gguf | 8.1 GB | LFS | Good conversion from bf16 down instead of from f16 | about 2 months ago |