Configuration Parsing Warning: In config.json: "quantization_config.bits" must be an integer

Notes

  • 3.75bpw test quant of CausalLM/35b-beta-long, which is in itself a finetune of CohereForAI/c4ai-command-r-v01 (hence the corrected licensing).
  • Theoretically should fit within 24GB of VRAM for inference.

TBA

Tokenizer is different from cohere - and chat template is ChatML - fully fine-tuned at 128K+

No loras, no quants, no tricks, 30M+ sft data.

Pressure Testing from: https://github.com/LeonEricsson/llmcontext

image/png

Downloads last month
6
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Datasets used to train txin/35b-beta-long-3.75bpw-exl2