---
inference: false
library_name: transformers
language:
- en
- fr
- de
- es
- it
- pt
- ja
- ko
- zh
- ar
- el
- fa
- pl
- id
- cs
- he
- hi
- nl
- ro
- ru
- tr
- uk
- vi
license: cc-by-nc-4.0
---

# Making AI Models Faster, Smaller, and Cleaner! 🥝

**Contact me** if you have suggestions for models to compress next or if you want to compress your own models!

**Reduction:** We reduced Aya's size from 11 GB to 7 GB, along with its inference cost and GPU usage.

**How does the compression work?** The model is compressed to 8-bit weights with Quanto (a minimal sketch is shown at the end of this card).

**How does the model quality change?** Output quality may differ from that of the base model.

**How is the model efficiency evaluated?** Results were measured on specific hardware and configurations; efficiency may vary with different settings.

**What is the model format?** The weights are stored in the safetensors format.

**What calibration data was used?** WikiText, when the compression method requires calibration data.

**How to compress your own models?** Request access for more compression methods and support: [Request Access](#)

### About

Aya 23 is an open-weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. Aya 23 pairs a highly performant pre-trained model from the [Command family](https://huggingface.co/CohereForAI/c4ai-command-r-plus) with the recently released [Aya Collection](https://huggingface.co/datasets/CohereForAI/aya_collection). The result is a powerful multilingual large language model serving 23 languages.

This model card corresponds to the 8-billion-parameter version of the Aya 23 model. We also released a 35-billion-parameter version, which you can find [here](https://huggingface.co/CohereForAI/aya-23-35B).

We cover 23 languages: Arabic, Chinese (simplified & traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese.

- Developed by: [Cohere For AI](https://cohere.for.ai) and [Cohere](https://cohere.com/)
- Point of Contact: Cohere For AI: [cohere.for.ai](https://cohere.for.ai/)
- License: [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license); also requires adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy)
- Model: aya-23-8B
- Model Size: 8 billion parameters

### Usage

Please install a `transformers` release that includes the necessary changes for this model:

```python
# pip install transformers==4.41.1
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/aya-23-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the message with the command-r-plus chat template
messages = [{"role": "user", "content": "Anneme onu ne kadar sevdiğimi anlatan bir mektup yaz"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <|START_OF_TURN_TOKEN|><|USER_TOKEN|>Anneme onu ne kadar sevdiğimi anlatan bir mektup yaz<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```

## Configurations

Configuration details can be found in `config.json`.

## Credits & License

The model follows the license of the original model; check the base model's license for details.
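## Compression Sketch

This card does not include the exact compression script, but a minimal sketch of 8-bit weight quantization with Quanto, via the `transformers` integration, might look like the following. The model ID is taken from the Usage section above; the actual options used to produce this checkpoint may differ.

```python
# pip install transformers optimum-quanto accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer, QuantoConfig

model_id = "CohereForAI/aya-23-8B"  # base model from the Usage section

# Quantize the linear-layer weights to 8-bit integers at load time.
# This is an illustrative configuration; the settings used to produce
# this checkpoint may have differed.
quantization_config = QuantoConfig(weights="int8")

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

The quantized model can then be used exactly like the base model, e.g. with the `generate()` call from the Usage section above.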
**Want to suggest or compress other models?** Send me a message!