---
base_model: CausalLM/14B
datasets:
- JosephusCheung/GuanacoDataset
- Open-Orca/OpenOrca
- stingning/ultrachat
- meta-math/MetaMathQA
- liuhaotian/LLaVA-Instruct-150K
- jondurbin/airoboros-3.1
- WizardLM/WizardLM_evol_instruct_V2_196k
- RyokoAI/ShareGPT52K
- RyokoAI/Fandom23K
- milashkaarshif/MoeGirlPedia_wikitext_raw_archive
- wikipedia
- wiki_lingua
- fnlp/moss-003-sft-data
- garage-bAInd/Open-Platypus
- LDJnr/Puffin
- openbmb/llava_zh
- BAAI/COIG
- TigerResearch/tigerbot-zhihu-zh-10k
- liwu/MNBVC
- teknium/openhermes
inference: false
language:
- en
- zh
license: wtfpl
model_creator: CausalLM
model_name: CausalLM 14B
model_type: llama
pipeline_tag: text-generation
prompt_template: '<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
'
quantized_by: cgus
tags:
- llama
- llama2
---
# CausalLM 14B - EXL2
- Model creator: [CausalLM](https://huggingface.co/CausalLM)
- Original model: [CausalLM 14B](https://huggingface.co/CausalLM/14B)

## Description

Experimental exl2 quantization of CausalLM-14B for ExLlamaV2.

I ran into some issues during the quantization process, so I suspect the result may have quality problems.
The 3.5bpw version barely fits into 12GB of VRAM, but it shows unusually high perplexity on the wikitext dataset.
I couldn't measure perplexity for the 4bpw version or compare it with TheBloke's GPTQ, so I don't know whether my quantization is flawed or whether this is simply expected for this model.
You can try this exl2 version, but I'd recommend using [TheBloke's GPTQ](https://huggingface.co/TheBloke/CausalLM-14B-GPTQ) version instead.
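
As a rough usage sketch, the snippet below loads a local copy of the exl2 quant with the `exllamav2` Python package and runs one prompt formatted with the ChatML template above. The model directory, sampler settings, and example prompt are illustrative assumptions, and class names follow the exllamav2 0.0.x API, which may change between versions; treat this as a starting point rather than a tested recipe.

```python
# Minimal sketch: load an exl2 quant with exllamav2 and generate from a ChatML prompt.
# Assumes the model files were downloaded to ./CausalLM-14B-exl2 (hypothetical path)
# and that exllamav2 is installed (pip install exllamav2).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "./CausalLM-14B-exl2"  # local model directory (assumption)
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # lazy cache so load_autosplit can place layers
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

# Build the ChatML prompt exactly as in the prompt_template above.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWhat is perplexity in language modeling?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

output = generator.generate_simple(prompt, settings, 256)
print(output)
```

On a 12GB card the 3.5bpw files should just barely load, as noted above; if loading fails, reducing the context length (and therefore the cache size) is the first thing to try.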