---
language:
- bn
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
inference: false
---

# Llama-3 Bangla 4-bit

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65ca6f0098a46a56261ac3ac/O1ATwhQt_9j59CSIylrVS.png" width="300"/>

</div>

- **Developed by:** KillerShoaib
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/llama-3-8b-bnb-4bit
- **Dataset used for fine-tuning:** iamshnoo/alpaca-cleaned-bengali

# 4-bit Quantization
**This is the 4-bit quantized version of the fine-tuned Llama-3 8B Bangla model.**

# Llama-3 Bangla in Other Formats

- `LoRA Adapters only` - [**KillerShoaib/llama-3-8b-bangla-lora**](https://huggingface.co/KillerShoaib/llama-3-8b-bangla-lora)
- `GGUF q4_k_m` - [**KillerShoaib/llama-3-8b-bangla-GGUF-Q4_K_M**](https://huggingface.co/KillerShoaib/llama-3-8b-bangla-GGUF-Q4_K_M)

# Model Details

The **Llama 3 8 billion** parameter model was fine-tuned with the **unsloth** package on a **cleaned Bangla Alpaca** dataset and then quantized to **4-bit**. It was fine-tuned for **2 epochs** on a single T4 GPU.

# Pros & Cons of the Model

## Pros

- **The model can comprehend the Bangla language, including its semantic nuances.**
- **Given a context, the model can answer questions based on that context.**

## Cons
- **The model cannot handle creative or complex tasks, e.g. writing a poem or solving a math problem in Bangla.**
- **Because the fine-tuning dataset was small, the model lacks a lot of general knowledge in Bangla.**

# Run The Model

## FastLanguageModel from unsloth for 2x faster inference

```python

from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "KillerShoaib/llama-3-8b-bangla-4bit",
    max_seq_length = 2048,
    dtype = None,  # None lets unsloth pick the dtype automatically
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # switch the model to inference mode

# alpaca_prompt for the model
alpaca_prompt = """Below is an instruction in bangla that describes a task, paired with an input also in bangla that provides further context. Write a response in bangla that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# tokenize the prompt, filled with an instruction and an (empty) input
inputs = tokenizer(
[
    alpaca_prompt.format(
        "সুস্থ থাকার তিনটি উপায় বলুন", # instruction ("Name three ways to stay healthy")
        "", # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

# generate the output and decode it (the decoded text includes the prompt)
outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True)
print(tokenizer.batch_decode(outputs))
```
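The Pros section notes that the model can answer questions from a given context. A minimal sketch of how such a prompt might be built, assuming the context passage goes into the `### Input:` slot of the template above. The helper `build_prompt` and the sample question and passage are illustrative and not part of the original script.

```python
# Alpaca-style template, copied from the snippet above.
ALPACA_PROMPT = """Below is an instruction in bangla that describes a task, paired with an input also in bangla that provides further context. Write a response in bangla that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""


def build_prompt(instruction: str, context: str = "") -> str:
    """Fill the template and leave the response slot empty for generation.

    Hypothetical helper: `instruction` is the question or task,
    `context` is an optional passage the answer should be based on.
    """
    return ALPACA_PROMPT.format(instruction, context, "")


# Illustrative inputs (assumptions, not taken from the training data).
question = "বাংলাদেশের রাজধানীর নাম কী?"  # "What is the capital of Bangladesh?"
passage = "ঢাকা বাংলাদেশের রাজধানী ও বৃহত্তম শহর।"  # "Dhaka is the capital and largest city of Bangladesh."

prompt = build_prompt(question, passage)
print(prompt)
```

The resulting `prompt` string can be passed to the tokenizer in place of the `alpaca_prompt.format(...)` call shown above.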

## AutoModelForCausalLM from Hugging Face

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "KillerShoaib/llama-3-8b-bangla-4bit" # Hugging Face Hub name or local folder path of the model
tokenizer_name = model_name

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
# Load model
model = AutoModelForCausalLM.from_pretrained(model_name)

alpaca_prompt = """Below is an instruction in bangla that describes a task, paired with an input also in bangla that provides further context. Write a response in bangla that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# tokenize the prompt, filled with an instruction and an (empty) input
inputs = tokenizer(
[
    alpaca_prompt.format(
        "সুস্থ থাকার তিনটি উপায় বলুন", # instruction ("Name three ways to stay healthy")
        "", # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

# generate the output and decode it (the decoded text includes the prompt)
outputs = model.generate(**inputs, max_new_tokens = 1024, use_cache = True)
print(tokenizer.batch_decode(outputs))
```
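`tokenizer.batch_decode(outputs)` returns the full sequence, so the decoded string contains the prompt as well as the answer. The sketch below shows one way to keep only the answer. The helper `extract_response` is hypothetical, the default end-of-sequence string `<|end_of_text|>` is an assumption (passing `tokenizer.eos_token` is safer in practice), and the `decoded` string is a hand-written stand-in, not real model output.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str, eos_token: str = "<|end_of_text|>") -> str:
    """Return the text after the first '### Response:' marker.

    Falls back to the whole string when the marker is missing.
    `eos_token` is an assumed default; pass tokenizer.eos_token if available.
    """
    _, sep, tail = decoded.partition(RESPONSE_MARKER)
    answer = tail if sep else decoded
    return answer.replace(eos_token, "").strip()


# Hand-written stand-in for one decoded sequence (not real model output).
answer_text = "নিয়মিত ব্যায়াম করুন।"  # "Exercise regularly."
decoded = (
    "<|begin_of_text|>Below is an instruction in bangla that describes a task.\n\n"
    "### Instruction:\nসুস্থ থাকার তিনটি উপায় বলুন\n\n"
    "### Input:\n\n\n"
    f"### Response:\n{answer_text}<|end_of_text|>"
)

print(extract_response(decoded))
```

With the snippets above, this would be applied to each element of the list returned by `tokenizer.batch_decode(outputs)`.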

# Inference Script & GitHub Repo

- `Google Colab` - [**Llama-3 8b Bangla Inference Script**](https://colab.research.google.com/drive/1jZaDmmamOoFiy-ZYRlbfwU0HaP3S48ER?usp=sharing)
- `GitHub Repo` - [**Llama-3 Bangla**](https://github.com/KillerShoaib/Llama-3-Bangla)