TBA
The tokenizer is different from Cohere's, the chat template is ChatML, and the model is fully fine-tuned at 128K+ context.
No LoRAs, no quants, no tricks.
Pressure testing from: https://github.com/LeonEricsson/llmcontext
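Since the chat template is ChatML, prompts follow the standard `<|im_start|>`/`<|im_end|>` turn delimiters. A minimal sketch of rendering a conversation in that format (plain Python, no tokenizer dependency; the exact special tokens are the standard ChatML ones, assumed to match this model's template):

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open an assistant turn to cue the model's reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
])
print(prompt)
```

In practice the same result comes from `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` once the model's tokenizer is loaded.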