Upload README.md with huggingface_hub
README.md
---
language:
- en
tags:
- llama.cpp
- gguf
- aya
- cohere
- quantized
library_name: llama.cpp
pipeline_tag: text-generation
license: apache-2.0
---

# Aya Sl Biz 8B

This is a quantized GGUF build of a fine-tuned CohereForAI/aya-23-8B model.

## Model Details

- **Original Model:** CohereForAI/aya-23-8B
- **Quantization Type:** Q4_K_M
- **Format:** GGUF
- **Conversion Date:** 2024-10-31
- **Framework:** llama.cpp

## Usage

This model runs with [llama.cpp](https://github.com/ggerganov/llama.cpp). Replace `path_to_model.gguf` with the path to the downloaded file:

```bash
# Basic usage: generate up to 512 tokens from a plain prompt
./llama-cli -m path_to_model.gguf -n 512 --prompt "Your prompt here"

# Chat format: wrap the system and user messages in the model's turn tokens
./llama-cli -m path_to_model.gguf \
  --temp 0.7 --repeat-penalty 1.2 -n 512 \
  --prompt "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>You are Command-R, a helpful AI assistant.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Your prompt here<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
```
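
The chat command above spells out every turn token by hand. As a minimal sketch, a small shell function can assemble the same prompt from a system message and a user message. `build_chat_prompt` is a hypothetical helper, not part of llama.cpp, and it assumes the token layout shown in the chat example.

```bash
# Hypothetical helper (not part of llama.cpp).
# $1 = system message, $2 = user message.
# Prints one prompt string that ends with the chatbot turn opener,
# so the model continues with its reply.
build_chat_prompt() {
  printf '%s%s%s' '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' "$1" '<|END_OF_TURN_TOKEN|>'
  printf '%s%s%s' '<|START_OF_TURN_TOKEN|><|USER_TOKEN|>' "$2" '<|END_OF_TURN_TOKEN|>'
  printf '%s' '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>'
}

# Example (commented out because it needs llama-cli and a model file):
# ./llama-cli -m path_to_model.gguf -n 512 \
#   --prompt "$(build_chat_prompt 'You are a helpful AI assistant.' 'Your prompt here')"
```

Keeping the tokens in one place makes it harder to drop or misspell one when the messages change.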

## Quantization Details

This model was quantized with llama.cpp's quantization tools to Q4_K_M, a 4-bit k-quant format that offers a good balance between file size and output quality.

- **Original model size:** ~16 GB
- **Quantized model size:** ~4.7 GB
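
The two sizes above can be sanity-checked with quick arithmetic. The sketch below assumes roughly 8 billion parameters (taken from the "8B" in the model name) and treats 1 GB as 10^9 bytes; the function names are illustrative only.

```bash
# Illustrative helpers; inputs are the approximate figures quoted above.

# size_ratio ORIGINAL_GB QUANTIZED_GB -> how many times smaller the quantized file is
size_ratio() {
  LC_ALL=C awk -v o="$1" -v q="$2" 'BEGIN { printf "%.1f", o / q }'
}

# bits_per_weight SIZE_GB PARAMS_BILLIONS -> average bits stored per parameter
# (GB * 8 bits per byte, divided by billions of parameters)
bits_per_weight() {
  LC_ALL=C awk -v s="$1" -v p="$2" 'BEGIN { printf "%.1f", s * 8 / p }'
}

echo "Size reduction: $(size_ratio 16 4.7)x"
echo "Original:  $(bits_per_weight 16 8) bits per weight"
echo "Quantized: $(bits_per_weight 4.7 8) bits per weight"
```

Under these assumptions the original works out to about 16 bits per weight and the quantized file to about 4.7 bits per weight, a reduction of roughly 3.4x.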

## License

This model is released under the Apache 2.0 license.