c4ai-command-r-v01 in one gguf

#3
by Markobes - opened

Hi,
It looks like this is the latest version (v01). I'm looking for single-file models for Ollama, Jan, etc., since I run them locally on an RTX 3090 Ti. If you manage to compress and accelerate them, that would be great.
Thank you.

Unsloth AI org


Apologies, could you elaborate on your question? I'm not sure I understand. Do you mean a GGUF file?

That's right, sorry for the inaccuracy. My main task is literary translation of subtitles, and most models are very bad at it; it can even serve as a test. For example, aya-expanse-32b.gguf (and Aya32 too) invents non-existent words and breaks the structure of the document in the chat. Command-r, along with Gemini-2, is one of the few models that handles this well. So I aim to find models up to ~30B that can be used productively on my machine (HP Z8 G4, 64 GB RAM, NVIDIA RTX 3090 Ti). The quantized Google Flan-T5 model simply refused to work in the chat for unknown reasons, etc.
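As an aside, for running a single-file GGUF locally: Ollama can import one via a minimal Modelfile. This is a sketch only; the quantization in the filename and the model name are hypothetical examples, not actual files from this repo.

```
# Modelfile — point FROM at a locally downloaded single-file GGUF
# (filename below is a hypothetical example)
FROM ./c4ai-command-r-v01-Q4_K_M.gguf
```

Then `ollama create command-r-local -f Modelfile` registers it, and `ollama run command-r-local` starts a chat with it.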
