Virt-io / FuseChat-7B-VaRM-GGUF

	GGUF quants of [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM). The imatrix was computed from the Q8 model at 8k context, using [Capybara-Binarized](https://huggingface.co/datasets/jan-hq/ldjnr_capybara_binarized) as the calibration data.
	(It only got 1500 chunks through the dataset before I got tired of waiting :\| cries in 6GB VRAM)

	The SillyTavern template is in the presets folder (not sure it's correct).

	I wouldn't go lower than IQ4_XS. IQ3_XXS and IQ3_XS work, but they're a little dumb.

	IQ1_S is unusable: it's too dumb and has a repetition problem.

	The imatrix was also used for Q4_K_M and Q5_K_M.
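
For anyone who wants to reproduce something similar, here is a rough Python sketch of how the imatrix and quantize steps could be driven with llama.cpp. It only builds the command lines and prints them; it does not run anything. This is not necessarily how these files were made. Assumptions: the binaries are called `llama-imatrix` and `llama-quantize` (older llama.cpp builds named them `imatrix` and `quantize`), "8k context" means 8192 tokens, the dataset has already been dumped to a plain text file, and every file name below is hypothetical.

```python
import shlex


def imatrix_cmd(model, dataset, out, ctx=8192, chunks=1500,
                binary="llama-imatrix"):
    """Build the importance-matrix command as an argument list.

    ctx=8192 is an assumed reading of "8k context"; chunks=1500 matches
    the number of chunks mentioned in this card.
    """
    return [binary, "-m", str(model), "-f", str(dataset), "-o", str(out),
            "-c", str(ctx), "--chunks", str(chunks)]


def quantize_cmd(src, dst, quant, imatrix=None, binary="llama-quantize"):
    """Build the quantize command, optionally pointing at an imatrix file."""
    cmd = [binary]
    if imatrix is not None:
        cmd += ["--imatrix", str(imatrix)]
    cmd += [str(src), str(dst), quant]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    q8_model = "FuseChat-7B-VaRM-Q8_0.gguf"
    f16_model = "FuseChat-7B-VaRM-F16.gguf"
    calibration = "capybara.txt"
    imatrix_file = "imatrix.dat"

    print(shlex.join(imatrix_cmd(q8_model, calibration, imatrix_file)))
    # Quant types named in this card.
    for quant in ["IQ4_XS", "Q4_K_M", "Q5_K_M"]:
        out_file = f"FuseChat-7B-VaRM-{quant}.gguf"
        print(shlex.join(quantize_cmd(f16_model, out_file, quant, imatrix_file)))
```

The printed lines can be pasted into a shell, or each list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.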