---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- c4ai-command-r-v01
---

Quantizations of https://huggingface.co/CohereForAI/c4ai-command-r-v01

# From the original README

**Usage**

Please use `transformers` version 4.39.1 or higher.

```python
# pip install 'transformers>=4.39.1'
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format message with the command-r chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
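
When running a quantized file outside `transformers`, the prompt may have to be assembled by hand. The sketch below rebuilds the string shown in the template comment above using plain Python. The special tokens are copied from that comment; the function name `format_command_r_prompt` and the mapping of the `assistant` role to `<|CHATBOT_TOKEN|>` are assumptions made for illustration, and the tokenizer's own chat template remains the authoritative source.

```python
# Special tokens copied from the template comment in the usage example.
BOS = "<BOS_TOKEN>"
START = "<|START_OF_TURN_TOKEN|>"
END = "<|END_OF_TURN_TOKEN|>"

# Assumed role mapping (hypothetical helper table, not taken from the tokenizer).
ROLE_TOKENS = {
    "user": "<|USER_TOKEN|>",
    "assistant": "<|CHATBOT_TOKEN|>",
}


def format_command_r_prompt(messages, add_generation_prompt=True):
    """Build a Command R style prompt string from role/content dicts."""
    parts = [BOS]
    for message in messages:
        role = message["role"]
        if role not in ROLE_TOKENS:
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"{START}{ROLE_TOKENS[role]}{message['content']}{END}")
    if add_generation_prompt:
        # Open a chatbot turn so the model continues with its reply.
        parts.append(f"{START}{ROLE_TOKENS['assistant']}")
    return "".join(parts)


messages = [{"role": "user", "content": "Hello, how are you?"}]
prompt = format_command_r_prompt(messages)
print(prompt)
```

Some runtimes prepend the BOS token on their own; in that case the leading `<BOS_TOKEN>` would likely need to be dropped to avoid duplicating it.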

**Quantized model through bitsandbytes, 8-bit precision**

```python
# pip install 'transformers>=4.39.1' bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model_id = "CohereForAI/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Format message with the command-r chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

# bitsandbytes places the quantized weights on the GPU, so move the inputs there too
input_ids = input_ids.to(model.device)

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
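
To get a feel for what 8-bit loading saves, the following back-of-the-envelope sketch estimates the memory needed for the weights alone at different precisions. The parameter count of 35 billion is an assumption not stated in this README, and `weight_memory_gb` is a hypothetical helper; the figures ignore activations, the KV cache, and any layers or scaling factors the quantizer keeps in higher precision.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumed parameter count for c4ai-command-r-v01 (not stated in this README).
NUM_PARAMS = 35e9

for label, bits in [("float16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Real usage will be higher than these numbers, so they are best read as a lower bound when choosing hardware.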