	---
	license: other
	language:
	- en
	pipeline_tag: text-generation
	inference: false
	tags:
	- transformers
	- gguf
	- imatrix
	- c4ai-command-r-v01
	---
	GGUF quantizations (with imatrix) of https://huggingface.co/CohereForAI/c4ai-command-r-v01

	# From original readme

	Usage

	Please use `transformers` version 4.39.1 or higher:
	```python
	# pip install 'transformers>=4.39.1'
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model_id = "CohereForAI/c4ai-command-r-v01"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id)

	# Format message with the command-r chat template
	messages = [{"role": "user", "content": "Hello, how are you?"}]
	input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
	## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

	gen_tokens = model.generate(
	    input_ids,
	    max_new_tokens=100,
	    do_sample=True,
	    temperature=0.3,
	)

	gen_text = tokenizer.decode(gen_tokens[0])
	print(gen_text)
	```
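
Runtimes that load the GGUF files often take a raw prompt string rather than a list of messages. The sketch below rebuilds the prompt layout shown in the comment above using plain Python. The helper name `build_command_r_prompt` is hypothetical, and the `<|SYSTEM_TOKEN|>` mapping is an assumption: only the user and chatbot tokens appear in the example above. Some runtimes may prepend the BOS token themselves, in which case it should be left out.

```python
# Role -> turn token. The user and chatbot tokens come from the example
# prompt above; the system token is an assumption about the same template.
ROLE_TOKENS = {
    "user": "<|USER_TOKEN|>",
    "assistant": "<|CHATBOT_TOKEN|>",
    "system": "<|SYSTEM_TOKEN|>",
}


def build_command_r_prompt(messages, add_generation_prompt=True):
    """Hypothetical helper: format chat messages as a Command-R style prompt."""
    parts = ["<BOS_TOKEN>"]
    for message in messages:
        role_token = ROLE_TOKENS[message["role"]]
        parts.append(
            "<|START_OF_TURN_TOKEN|>"
            + role_token
            + message["content"]
            + "<|END_OF_TURN_TOKEN|>"
        )
    if add_generation_prompt:
        # Open a chatbot turn so the model continues with its reply.
        parts.append("<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>")
    return "".join(parts)


prompt = build_command_r_prompt(
    [{"role": "user", "content": "Hello, how are you?"}]
)
print(prompt)
```

When `transformers` is available, `tokenizer.apply_chat_template` remains the safer choice, since it applies the template shipped with the tokenizer.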

	Quantized model through bitsandbytes (8-bit precision)

	```python
	# pip install 'transformers>=4.39.1' bitsandbytes accelerate
	from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

	bnb_config = BitsAndBytesConfig(load_in_8bit=True)

	model_id = "CohereForAI/c4ai-command-r-v01"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

	# Format message with the command-r chat template
	messages = [{"role": "user", "content": "Hello, how are you?"}]
	input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
	## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

	gen_tokens = model.generate(
	    input_ids,
	    max_new_tokens=100,
	    do_sample=True,
	    temperature=0.3,
	)

	gen_text = tokenizer.decode(gen_tokens[0])
	print(gen_text)
	```
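
Both snippets decode the full output row, so the printed text includes the prompt as well as the reply. A minimal sketch of keeping only the newly generated ids, assuming the prompt ids come first in the output; `completion_only` is a hypothetical helper and the token ids are stand-in values, not real vocabulary entries.

```python
def completion_only(generated_ids, prompt_length):
    """Return the ids that follow the prompt (prompt ids are assumed to come first)."""
    return generated_ids[prompt_length:]


# Stand-in values for illustration. With the snippets above, the call would be
# completion_only(gen_tokens[0], input_ids.shape[-1]).
prompt_ids = [5, 101, 102]              # hypothetical prompt token ids
generated = prompt_ids + [42, 43, 44]   # hypothetical generate() output row
new_ids = completion_only(generated, len(prompt_ids))
print(new_ids)
```

The sliced ids can then be passed to `tokenizer.decode`, optionally with `skip_special_tokens=True` to drop the turn markers.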