---
base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
language:
- en
- yo
- zu
- xh
- wo
- fr
- ig
- ha
- am
- ar
- so
- sw
- sn
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
datasets:
- vutuka/aya_african_alpaca
pipeline_tag: text-generation
---

# Llama-3.1-8B-african-aya

- **Developed by:** vutuka
- **License:** apache-2.0
- **Finetuned from model:** unsloth/meta-llama-3.1-8b-bnb-4bit

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

## LlamaCPP Code

Install llama-cpp-python with OpenBLAS acceleration:

```sh
CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" \
pip install llama-cpp-python
```

```py
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Download the GGUF model
model_name = "vutuka/Llama-3.1-8B-african-aya"
model_file = "llama-3.1-8B-african-aya.Q8_0.gguf"
model_path = hf_hub_download(model_name, filename=model_file)

## Instantiate the model from the downloaded file
llm = Llama(
    model_path=model_path,
    n_ctx=4096,
    n_gpu_layers=-1,
    n_batch=512,
    verbose=False,
)

## Run inference with the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

## Yoruba input: "Two kidnappers were arrested in Supare Akoko; explain the story."
prompt = alpaca_prompt.format(
    "",
    "Àwọn ajínigbé méjì ni wọ́n mú ní Supare Akoko, ṣàlàyé ìtàn náà.",
    "",
)

res = llm(prompt)  # res is a completion dictionary

## Extract the generated text from the response dictionary and print it
print(res["choices"][0]["text"])
```
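For interactive use, llama-cpp-python can also stream tokens as they are generated instead of returning one finished dictionary. Below is a minimal sketch reusing the `llm` and `alpaca_prompt` objects defined above; the `max_tokens` and `stop` values are illustrative assumptions, not settings taken from this model card.

```py
## Stream the completion chunk by chunk; reuses `llm` and `alpaca_prompt`
## from the inference example above.
prompt = alpaca_prompt.format(
    "",
    "Àwọn ajínigbé méjì ni wọ́n mú ní Supare Akoko, ṣàlàyé ìtàn náà.",
    "",
)

for chunk in llm(
    prompt,
    max_tokens=512,              # illustrative generation cap
    stop=["### Instruction:"],   # stop if the model starts a new Alpaca turn
    stream=True,                 # yield partial results instead of one dict
):
    ## Each chunk has the same shape as the non-streaming response
    print(chunk["choices"][0]["text"], end="", flush=True)
print()
```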
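The `transformers` tag suggests the repository may also ship standard Hugging Face weights alongside the GGUF file; that is an assumption, so verify the repo's file list on the Hub first. If they are present, a plain `transformers` pipeline can serve the same Alpaca-formatted prompt (reusing the `alpaca_prompt` template from above):

```py
## Assumes vutuka/Llama-3.1-8B-african-aya hosts transformers-format weights;
## check the repository contents before relying on this path.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="vutuka/Llama-3.1-8B-african-aya",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = alpaca_prompt.format(
    "",
    "Àwọn ajínigbé méjì ni wọ́n mú ní Supare Akoko, ṣàlàyé ìtàn náà.",
    "",
)

out = generator(prompt, max_new_tokens=256, do_sample=False)
print(out[0]["generated_text"])
```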