---
language:
- en
- fr
- de
- es
- it
- pt
- ja
- ko
- zh
- ar
library_name: transformers
---

# Model Card for C4AI Command-R quantized to 4-bit

## Model Summary

This repo contains a 4-bit quantized version of C4AI Command-R.

C4AI Command-R is a highly performant generative model with 35 billion parameters. It is a large language model with open weights, optimized for a variety of use cases including reasoning, summarization, and question answering. Command-R supports multilingual generation, evaluated in 10 languages, and has highly performant RAG capabilities.

Developed by: Cohere and [Cohere For AI](https://cohere.for.ai)

- Point of Contact: Cohere For AI: [cohere.for.ai](https://cohere.for.ai/)
- License: [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license); also requires adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy)
- Model: c4ai-command-r-v01
- Model Size: 35 billion parameters
- Context length: 128K

**Use**

```python
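# pip install transformers
# Assumption: pre-quantized 4-bit checkpoints usually also need `bitsandbytes` and `accelerate` installed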
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "prince-canuma/c4ai-command-r-v01-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Format message with the command-r chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
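
The prompt format shown in the comment above can be reproduced by hand for illustration. The following is a minimal sketch, not the tokenizer's actual template: `render_prompt` and `ROLE_TOKENS` are hypothetical names, and the mapping of the `assistant` role to `<|CHATBOT_TOKEN|>` is an assumption inferred from the generation prompt. In practice, `tokenizer.apply_chat_template` remains the source of truth.

```python
# Illustrative sketch only: mirrors the single-turn format shown in the usage example.
# ROLE_TOKENS and render_prompt are hypothetical helpers, not part of transformers.
ROLE_TOKENS = {
    "user": "<|USER_TOKEN|>",
    # Assumption: assistant turns are marked with the chatbot token.
    "assistant": "<|CHATBOT_TOKEN|>",
}


def render_prompt(messages, add_generation_prompt=True):
    """Join chat messages into a single Command-R style prompt string."""
    parts = ["<BOS_TOKEN>"]
    for message in messages:
        parts.append(
            "<|START_OF_TURN_TOKEN|>"
            + ROLE_TOKENS[message["role"]]
            + message["content"]
            + "<|END_OF_TURN_TOKEN|>"
        )
    if add_generation_prompt:
        # Open a chatbot turn so the model continues with its reply.
        parts.append("<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>")
    return "".join(parts)


messages = [{"role": "user", "content": "Hello, how are you?"}]
print(render_prompt(messages))
```

Seeing the string laid out this way makes it easier to check that a hand-built prompt has the turn boundaries the model expects before comparing it with the tokenizer's output.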

## Model Details

**Input**: The model takes text input only.

**Output**: The model generates text only.

**Model Architecture**: This is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, this model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety.

**Languages covered**: The model is optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. 

Pre-training data additionally included the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian.

**Context length**: Command-R supports a context length of 128K.
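
When working with long prompts, it can help to check the token budget before generating. The sketch below is a hypothetical helper, not part of any library; it assumes "128K" can be read conservatively as 128,000 tokens, which is an interpretation rather than a figure stated in this card.

```python
# Assumption: "128K" is read conservatively as 128,000 tokens.
ASSUMED_CONTEXT_TOKENS = 128_000


def fits_in_context(prompt_tokens, max_new_tokens, context_tokens=ASSUMED_CONTEXT_TOKENS):
    """Return True if the prompt plus the requested new tokens fit in the context window."""
    return prompt_tokens + max_new_tokens <= context_tokens


# Example: a 120,000-token prompt leaves room for 100 new tokens.
print(fits_in_context(120_000, 100))
```

With the usage example above, the prompt length would be `input_ids.shape[-1]`.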

### Code Capabilities:
Command-R has been optimized to interact with your code when you request code snippets, code explanations, or code rewrites. It might not perform well out of the box for pure code completion. For better performance on code-generation instructions, we also recommend using a low temperature, or even greedy decoding.
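
One way to apply the decoding recommendation above is to choose generation settings per task. This is a minimal sketch: `generation_kwargs` is a hypothetical helper, and the sampling values for non-code tasks are simply those from the usage example in this card, not tuned recommendations.

```python
def generation_kwargs(code_task, max_new_tokens=100):
    """Build keyword arguments for model.generate() depending on the task type."""
    kwargs = {"max_new_tokens": max_new_tokens}
    if code_task:
        # Greedy decoding: always pick the most likely next token.
        kwargs["do_sample"] = False
    else:
        # Sampling settings taken from the usage example in this card.
        kwargs["do_sample"] = True
        kwargs["temperature"] = 0.3
    return kwargs


print(generation_kwargs(code_task=True))
```

The result can then be passed on as `model.generate(input_ids, **generation_kwargs(code_task=True))`.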

### Model Card Contact
For errors or additional questions about details in this model card, contact [info@for.ai](mailto:info@for.ai).

### Terms of Use: 
We hope that releasing the weights of a highly performant 35 billion parameter model to researchers all over the world will make community-based research efforts more accessible. This model is governed by a [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license) License with an acceptable use addendum, and also requires adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy).

### Try Chat:
You can try Command-R chat in the playground [here](https://dashboard.cohere.com/playground/chat).