dranger003/Smaug-72B-v0.1-iMat.GGUF

	---
	license: other
	license_name: tongyi-qianwen-license-agreement
	license_link: >-
	  https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT
	pipeline_tag: text-generation
	---
	GGUF importance matrix (imatrix) quants for https://huggingface.co/abacusai/Smaug-72B-v0.1
	The importance matrix was trained for ~100K tokens (200 batches of 512 tokens) using wiki.train.raw.

	Update 2024-03-14:
	* New quant IQ1_S using latest commit `4755afd1`.

	Update 2024-03-02:
	* New quants IQ2_S/IQ2_M; these require commit [a33e6a0d](https://github.com/ggerganov/llama.cpp/commit/a33e6a0d2a66104ea9a906bdbf8a94d050189d91) or later.
	* The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a [general purpose imatrix calibration dataset](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384).
	* This is a different calibration dataset from the one used for the previous quants I posted, so the quality can be compared.
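
The token counts quoted above follow directly from the batch figures. A minimal sketch of the arithmetic, where the helper name `calibration_tokens` is only illustrative and not part of llama.cpp:

```python
# Token counts implied by the batch figures quoted in this README.
BATCH_TOKENS = 512  # tokens per batch, as stated above


def calibration_tokens(batches: int, batch_tokens: int = BATCH_TOKENS) -> int:
    """Total number of tokens seen while computing an importance matrix."""
    return batches * batch_tokens


wiki_tokens = calibration_tokens(200)     # wiki.train.raw run
general_tokens = calibration_tokens(105)  # general purpose calibration dataset run

print(wiki_tokens)     # 102400, roughly 100K
print(general_tokens)  # 53760, roughly 50K
```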

	The conversation template is Llama-2, with the system prompt set to the [Qwen system prompt](https://github.com/QwenLM/Qwen/blob/main/examples/system_prompt.md).

	| Layers | Context | Template |
	| --- | --- | --- |
	| <pre>80</pre> | <pre>32768</pre> | <pre>[INST] &lt;&lt;SYS&gt;&gt;<br>{instructions}<br>&lt;&lt;/SYS&gt;&gt;<br><br>{prompt} [/INST]<br>{response}</pre> |
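
To make the template in the table concrete, here is a minimal sketch of filling it in by hand. The function name `build_prompt` and the example system prompt are assumptions for illustration only; special tokens such as BOS are left to whatever runtime loads the model.

```python
def build_prompt(instructions: str, prompt: str) -> str:
    """Fill the Llama-2 style template from the table above.

    The model's {response} is expected to follow the returned text.
    """
    return (
        "[INST] <<SYS>>\n"
        f"{instructions}\n"
        "<</SYS>>\n\n"
        f"{prompt} [/INST]\n"
    )


# Placeholder system prompt (an assumption); the README links to the
# Qwen system prompt examples for real ones.
text = build_prompt("You are a helpful assistant.", "What is GGUF?")
print(text)
```

Keeping the template in one small function makes it easy to swap system prompts when comparing quants.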