---
library_name: peft
license: apache-2.0
datasets:
- Yasbok/Alpaca_arabic_instruct
language:
- ar
pipeline_tag: text-generation
---

# 🚀 Falcon-7b-QLoRA-alpaca-arabic

This repo contains a low-rank adapter for Falcon-7b, fine-tuned on the Arabic version of the Stanford Alpaca dataset, [Yasbok/Alpaca_arabic_instruct](https://huggingface.co/datasets/Yasbok/Alpaca_arabic_instruct).

## Model Summary

- **Model Type:** Causal decoder-only
- **Language(s):** Arabic
- **Base Model:** [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) (License: [Apache 2.0](https://huggingface.co/tiiuae/falcon-7b#license))
- **Dataset:** [Yasbok/Alpaca_arabic_instruct](https://huggingface.co/datasets/Yasbok/Alpaca_arabic_instruct)
- **License(s):** Apache 2.0, inherited from the base model

## Model Details

The model was fine-tuned in 8-bit precision using 🤗 `peft` adapters, `transformers`, and `bitsandbytes`. Training followed the QLoRA method introduced in this [paper](https://arxiv.org/abs/2305.14314). The run took approximately 3 hours on a workstation with a single NVIDIA A100-SXM GPU with 37 GB of available memory. A minimal reproduction sketch is provided at the end of this card.

### Model Date

June 10, 2023

### Recommendations

We recommend that users of this model develop guardrails and take appropriate precautions for any production use.

## How to Get Started with the Model

### Setup

```python
# Install packages
!pip install -q -U bitsandbytes loralib einops
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
```

### GPU Inference in 8-bit

This requires a GPU with at least 12 GB of memory.

### First, Load the Model

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in 8-bit precision, then attach the fine-tuned adapter
peft_model_id = "Ali-C137/falcon-7b-chat-alpaca-arabic"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    device_map={"": 0},
    trust_remote_code=True,
    load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token
model = PeftModel.from_pretrained(model, peft_model_id)
```

An example generation call is shown in the Example Generation section at the end of this card.

### CUDA Info

- CUDA Version: 12.0
- Hardware: 1 A100-SXM
- Max Memory: `{0: "37GB"}`
- Device Map: `{"": 0}`

### Package Versions Employed

- `torch`: 2.0.1+cu118
- `transformers`: 4.30.0.dev0
- `peft`: 0.4.0.dev0
- `accelerate`: 0.19.0
- `bitsandbytes`: 0.39.0
- `einops`: 0.6.1

### Acknowledgements

This work is heavily inspired by [Daniel Furman](https://huggingface.co/dfurman)'s work. Thanks a lot, Daniel!
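
### Example Generation

Continuing from the loading snippet above, the sketch below runs a single generation. It assumes an Alpaca-style `### Instruction:` / `### Response:` prompt template, since the exact template used during fine-tuning is not documented here, and the sampling parameters are illustrative defaults rather than recommended values.

```python
# Alpaca-style prompt (assumed template); the instruction asks,
# in Arabic: "What is the capital of Morocco?"
prompt = """### Instruction:
ما هي عاصمة المغرب؟

### Response:
"""

# Tokenize and move the inputs to the GPU the model was loaded on
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")

# Generate a completion; these sampling parameters are illustrative
with torch.no_grad():
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```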
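
### Reproducing a Similar Fine-Tune (Sketch)

For readers who want to reproduce a similar 8-bit LoRA fine-tune with the libraries listed above, here is a minimal setup sketch. The LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are assumptions for illustration, not the values used to train this adapter; `query_key_value` is Falcon-7B's fused attention projection module.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Load the base model in 8-bit precision
base_model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    load_in_8bit=True,
    device_map={"": 0},
    trust_remote_code=True,
)

# Cast layer norms to fp32 and enable gradient checkpointing
# for stable 8-bit training
base_model = prepare_model_for_int8_training(base_model)

# Attach low-rank adapters; these hyperparameters are assumed,
# not the documented values for this adapter
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()

# From here, train with a standard transformers Trainer on
# Yasbok/Alpaca_arabic_instruct, then push the adapter with
# peft_model.push_to_hub(...)
```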