Aya Sl Biz 8B

This is a GGUF-format, Q4_K_M-quantized version of a fine-tuned CohereForAI/aya-23-8B model.

Model Details

  • Original Model: CohereForAI/aya-23-8B
  • Quantization Type: Q4_K_M
  • Format: GGUF
  • Conversion Date: 2024-10-31
  • Framework: llama.cpp

Usage

This model runs with llama.cpp. For example:

# Basic usage
./llama-cli -m path_to_model.gguf -n 512 --prompt "Your prompt here"

# Chat format
./llama-cli -m path_to_model.gguf --temp 0.7 --repeat-penalty 1.2 -n 512 --prompt "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>You are Command-R, a helpful AI assistant.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Your prompt here<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
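The chat-format prompt above is long and easy to mistype, so it may help to assemble it from its parts. The following is a minimal sketch, not part of the original card: `build_prompt` is a hypothetical helper name, and it simply concatenates the turn tokens exactly as they appear in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original card): wraps a system message
# and a user message in the turn tokens used by the chat-format command.
build_prompt() {
  local system_msg="$1"
  local user_msg="$2"
  # printf '%s' reuses the format for every argument, so the three pieces
  # are concatenated with no separators and no trailing newline.
  printf '%s' \
    "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>${system_msg}<|END_OF_TURN_TOKEN|>" \
    "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>${user_msg}<|END_OF_TURN_TOKEN|>" \
    "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
}
```

To use it, capture the result with `prompt="$(build_prompt "You are Command-R, a helpful AI assistant." "Your prompt here")"` and pass `--prompt "$prompt"` to the `./llama-cli` command shown above.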

Quantization Details

This model was quantized to Q4_K_M, a 4-bit k-quant format in llama.cpp that trades a modest loss in output quality for a much smaller file. The quantization was performed using llama.cpp's quantization tools.
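The card does not record the exact commands that were run. As a rough sketch only, a typical two-step llama.cpp workflow looks like the following, assuming a recent llama.cpp checkout in which the conversion script is named `convert_hf_to_gguf.py` and the quantization binary is `llama-quantize` (older checkouts use different names). All three paths are hypothetical placeholders, and the script only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the two steps instead of executing them.
# Every path below is a hypothetical placeholder, not the author's actual path.
HF_MODEL_DIR="./finetuned-aya-23-8b"   # hypothetical: local fine-tuned checkpoint
F16_GGUF="./model-f16.gguf"            # hypothetical: intermediate FP16 GGUF file
Q4_GGUF="./model-Q4_K_M.gguf"          # hypothetical: final quantized file

quantize_steps() {
  # Step 1: convert the Hugging Face checkpoint to an FP16 GGUF file.
  echo "python convert_hf_to_gguf.py ${HF_MODEL_DIR} --outfile ${F16_GGUF} --outtype f16"
  # Step 2: quantize the FP16 GGUF file to Q4_K_M.
  echo "./llama-quantize ${F16_GGUF} ${Q4_GGUF} Q4_K_M"
}

quantize_steps
```

Running the printed commands from the llama.cpp directory would perform the conversion, provided the assumptions above match the installed version.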

  • Original model size: ~16GB
  • Quantized model size: ~4.7GB
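As a quick sanity check on these sizes, the effective bits per weight can be estimated by dividing the file size in bits by the parameter count. The sketch below uses the approximate sizes quoted above and the 8.03B parameter count reported for this model; the function names are hypothetical, 1 GB is assumed to be 10^9 bytes, and the result is only approximate because the sizes are rounded and a GGUF file also holds metadata and tokenizer data.

```shell
#!/usr/bin/env bash
# bits_per_weight FILE_SIZE_GB PARAMS_BILLIONS
# Prints file size in bits divided by parameter count, to two decimals.
# Assumes 1 GB = 10^9 bytes, so the 10^9 factors cancel.
bits_per_weight() {
  LC_ALL=C awk -v gb="$1" -v params="$2" 'BEGIN { printf "%.2f\n", gb * 8 / params }'
}

# size_ratio_percent QUANTIZED_GB ORIGINAL_GB
# Prints the quantized size as a whole-number percentage of the original.
size_ratio_percent() {
  LC_ALL=C awk -v q="$1" -v o="$2" 'BEGIN { printf "%.0f\n", 100 * q / o }'
}

bits_per_weight 16 8.03      # prints 15.94 (consistent with 16-bit weights)
bits_per_weight 4.7 8.03     # prints 4.68
size_ratio_percent 4.7 16    # prints 29
```

By this estimate the quantized file is roughly 29% of the original size, at a little under 5 bits per weight.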

License

This model is released under the Apache 2.0 license.

GGUF Metadata

  • Model size: 8.03B params
  • Architecture: command-r
  • Quantization: 4-bit
