txin
/

35b-beta-long-3.75bpw-exl2

Text Generation

Inference Endpoints

text-generation-inference

Model card Files Files and versions Community

35b-beta-long-3.75bpw-exl2 / README.md

txin's picture

Upload 8 files

c80a036 verified about 2 months ago

|

raw history blame contribute delete

No virus

1.14 kB

	---
	license: cc-by-nc-4.0
	language:
	- en
	- zh
	- ja
	- de
	datasets:
	- JosephusCheung/GuanacoDataset
	- meta-math/MetaMathQA
	- jondurbin/airoboros-3.1
	- WizardLM/WizardLM_evol_instruct_V2_196k
	- RyokoAI/ShareGPT52K
	- RyokoAI/Fandom23K
	- milashkaarshif/MoeGirlPedia_wikitext_raw_archive
	- wikipedia
	- wiki_lingua
	- garage-bAInd/Open-Platypus
	- LDJnr/Puffin
	- BAAI/COIG
	- TigerResearch/tigerbot-zhihu-zh-10k
	- liwu/MNBVC
	- teknium/openhermes
	- CausalLM/Refined-Anime-Text
	- microsoft/orca-math-word-problems-200k
	- m-a-p/CodeFeedback-Filtered-Instruction
	---
	# Notes
	- 3.75bpw test quant of CausalLM/35b-beta-long, which is in itself a finetune of CohereForAI/c4ai-command-r-v01 (hence the corrected licensing).
	- Theoretically should fit within 24GB of VRAM for inference.

	## TBA

	Tokenizer is different from cohere - and chat template is ChatML - fully fine-tuned at 128K+

	No loras, no quants, no tricks, 30M+ sft data.

	Pressure Testing from: https://github.com/LeonEricsson/llmcontext

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63468a143ea42ee2cb49ddd1/2XbONpyTeMH1qWCtE9ziH.png)