c4ai-command-r-v01 in one gguf

#3
by Markobes - opened

Hi,
It looks like this is the latest version (v01). I'm looking for single-file models for Ollama, Jan, etc., since I run them locally on an RTX 3090 Ti. If you manage to compress and accelerate them, that would be great.
Thank you.

Unsloth AI org


Apologies, could you elaborate on your question? I'm not sure I understand. Do you mean a GGUF file?

That's right, sorry for the inaccuracy. My main task is literary translation of subtitles, and most models are very bad at it; it can even serve as a test. For example, aya-expanse-32b.gguf (and Aya32 too) invents non-existent words and breaks the structure of the document in the chat. Command-r, along with Gemini-2, is one of the few models that handles this well. So I aim to find models up to ~30B that can be used productively on my machine (HP Z8 G4, 64 GB RAM, NVIDIA RTX 3090 Ti). The quantized Google Flan-T5 model simply refused to work in the chat for unknown reasons, etc.
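As an aside, for running a single-file GGUF locally: Ollama can import one via a minimal Modelfile. This is a sketch only; the quantization in the filename and the model name are hypothetical examples, not actual files from this repo.

```
# Modelfile — point FROM at a locally downloaded single-file GGUF
# (filename below is a hypothetical example)
FROM ./c4ai-command-r-v01-Q4_K_M.gguf
```

Then `ollama create command-r-local -f Modelfile` registers it, and `ollama run command-r-local` starts a chat with it.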
