Configuration Parsing Warning: In config.json: "quantization_config.bits" must be an integer

Notes

3.75bpw test quant of CausalLM/35b-beta-long, which is in itself a finetune of CohereForAI/c4ai-command-r-v01 (hence the corrected licensing).
Theoretically should fit within 24GB of VRAM for inference.

TBA

Tokenizer is different from cohere - and chat template is ChatML - fully fine-tuned at 128K+

No loras, no quants, no tricks, 30M+ sft data.

Pressure Testing from: https://github.com/LeonEricsson/llmcontext

Downloads last month: 6

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

txin
/

35b-beta-long-3.75bpw-exl2

Notes

TBA

Datasets used to train txin/35b-beta-long-3.75bpw-exl2