--- library_name: transformers tags: [] --- # Model Card for C4AI Command-R ## Model Summary C4AI Command-R is a research release of a 35 billion parameter highly performant generative model. Command-R is a large language model with open weights optimized for a variety of use cases including reasoning, summarization, and question answering. Command-R has the capability for multilingual generation evaluated in 10 languages and highly performant RAG capabilities. Developed by: Cohere and [Cohere For AI](https://cohere.for.ai) - Point of Contact: Cohere For AI: [cohere.for.ai](https://cohere.for.ai/) - License: [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license), requires also adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy) - Model: c4ai-command-r-0.1 - Model Size: 35 billion parameters - Context length: 128K **Use** ```python # pip install transformers from transformers import AutoTokenizer, AutoModelForCausalLM model_id = "CohereForAI/c4ai-command-r-v01" tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True) model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True) # Format message with the command-r chat template messages = [{"role": "user", "content": "Hello, how are you?"}] input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True) ## <|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|> gen_tokens = model.generate( input_ids, do_sample=True, temperature=0.3, max_length=100, ) gen_text = tokenizer.decode(gen_tokens[0]) print(gen_text) ``` ## Model Details **Input**: Models input text only. **Output**: Models generate text only. **Model Architecture**: This is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, this model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety. **Languages covered**: The model is optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. Pre-training data additionally included the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian. **Context length**: Command-R supports a context length of 128K. ### Tool use capabilities: Command-R has been specifically trained with conversational tool use capabilities. These have been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template will likely reduce performance, but we encourage experimentation. Command-R’s Tool use functionality takes a conversation as input (with an optional user-system preamble), along with a list of available tools. The model will then generate a json-formatted list of actions to execute on a subset of those tools. Command-R may use one of its supplied tools more than once. The model has been trained to recognise a special `directly_answer` tool, which it uses to indicate that it doesn’t want to use any of its other tools. We recommend including the directly_answer tool, but encourage experimentation. Comprehensive documentation and guides on prompting strategies for tool use will be provided shortly. ### Grounded Generation and RAG Capabilities: Command-R has been specifically trained with grounded generation capabilities. This means that it can generate responses based on a list of supplied document snippets, and it will include grounding spans (citations) in its response indicating the source of the information. This can be used to enable behaviours such as grounded summarization and the final step of Retrieval Augmented Generation (RAG). This behavior has been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template may reduce performance, but we encourage experimentation. Command-R’s Grounded Generation behavior takes a conversation as input (with an optional user-supplied system preamble), along with a list of retrieved document snippets. The document snippets should be chunks, rather than long documents, typically around 100-400 words per chunk. Document snippets consist of key-value pairs. The Keys should be short descriptive strings, the values can be text or semi-structured. Comprehensive documentation and guides on prompting strategies on grounded generation will be provided shortly. ### Code Capabilities: Command-R has been optimized to interact with your code, by requesting code snippets, code explanations, or code rewrites. It might not perform well out-of-the-box for pure code completion. For better performance, we also recommend using a low temperature (and even greedy decoding) for code-generation related instructions. ### Model Card Contact For errors or additional questions about details in this model card, contact [info@for.ai](mailto:info@for.ai). ### Terms of Use: We hope that the release of this model will make community-based research efforts more accessible, by releasing the weights of a highly performant 35 billion parameter model to researchers all over the world. This model is governed by a [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license) License with an acceptable use addendum, and also requires adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy).