Model: FalconMasr
This model is based on Falcon-7B, loaded with 4-bit quantization for efficient memory usage and fine-tuned with LoRA (Low-Rank Adaptation) for causal language modeling in Arabic. The fine-tuning is intended to improve the quality of the model's Arabic responses.
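To give a rough sense of what 4-bit quantization saves, the sketch below estimates weight storage only. The parameter count of 7e9 is a nominal assumption taken from the "7B" in the base model's name, and the estimate ignores quantization constants, any layers kept at higher precision, activations, and the LoRA adapter weights.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: nominal parameter count implied by the "7B" model name.
N_PARAMS = 7e9

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # unquantized bf16 weights
nf4_gb = weight_memory_gb(N_PARAMS, 4)    # 4-bit (nf4) weights

print(f"bf16 weights: ~{bf16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.1f} GB")
```

Actual GPU usage will be higher than the 4-bit figure because of the overheads listed above.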
Model Configuration
- Base Model: `ybelkada/falcon-7b-sharded-bf16`
- Quantization: 4-bit with `nf4` quantization type and `float16` computation
- LoRA Configuration: `lora_alpha=16`, `lora_dropout=0`, `r=64`
- Task Type: Causal Language Modeling
- Target Modules: `query_key_value`, `dense`, `dense_h_to_4h`, `dense_4h_to_h`
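As a rough illustration of what this LoRA configuration implies, the sketch below counts the adapter parameters it would add. The layer dimensions (hidden size 4544, head dimension 64, 32 decoder layers, multi-query attention) are assumptions taken from the published Falcon-7B architecture, not from this model card, and the count assumes each target module gets one pair of rank-`r` matrices with no bias terms.

```python
R = 64           # LoRA rank, from the configuration above
LORA_ALPHA = 16  # LoRA alpha, from the configuration above

# Assumptions: Falcon-7B architecture values, not stated in this model card.
HIDDEN = 4544
HEAD_DIM = 64
N_LAYERS = 32

# (input features, output features) of each targeted linear layer.
# With multi-query attention, query_key_value emits all query heads
# plus a single shared key head and a single shared value head.
TARGET_SHAPES = {
    "query_key_value": (HIDDEN, HIDDEN + 2 * HEAD_DIM),
    "dense": (HIDDEN, HIDDEN),
    "dense_h_to_4h": (HIDDEN, 4 * HIDDEN),
    "dense_4h_to_h": (4 * HIDDEN, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, R) for d_in, d_out in TARGET_SHAPES.values())
total = per_layer * N_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

print(f"adapter parameters per layer: {per_layer:,}")
print(f"adapter parameters in total:  {total:,}")
print(f"LoRA scaling (alpha / r):     {scaling}")
```

Under these assumptions the adapters add roughly 130 million trainable parameters, and the low-rank update is scaled by 0.25.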
Training
The model was fine-tuned on a custom Arabic text dataset. Training loss, logged every 10 steps, fell from 1.4591 at step 10 to 0.7460 at step 400, with noticeable step-to-step fluctuation, as shown in the table below:
Step | Training Loss |
---|---|
10 | 1.459100 |
20 | 1.335000 |
30 | 1.295600 |
40 | 1.177000 |
50 | 1.144900 |
60 | 1.132900 |
70 | 1.074500 |
80 | 1.078600 |
90 | 1.121100 |
100 | 0.936000 |
110 | 1.151500 |
120 | 1.068000 |
130 | 1.056700 |
140 | 0.976900 |
150 | 0.867300 |
160 | 1.151100 |
170 | 1.023200 |
180 | 1.074300 |
190 | 1.036800 |
200 | 0.930700 |
210 | 0.960800 |
220 | 1.098800 |
230 | 0.967400 |
240 | 0.961700 |
250 | 0.871100 |
260 | 0.869400 |
270 | 0.939500 |
280 | 1.087600 |
290 | 1.080700 |
300 | 0.906800 |
310 | 0.901600 |
320 | 0.943200 |
330 | 0.968900 |
340 | 0.986600 |
350 | 1.014200 |
360 | 1.191700 |
370 | 0.992500 |
380 | 0.963600 |
390 | 0.888800 |
400 | 0.746000 |
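Because the logged loss fluctuates from step to step, a smoothed view makes the overall trend easier to read. The following is a minimal sketch that applies a simple moving average to the values in the table above; the window of 10 logged points (100 training steps) is an arbitrary choice for illustration.

```python
from statistics import mean

# Training loss from the table above, one value per 10 steps (steps 10..400).
LOSSES = [
    1.4591, 1.3350, 1.2956, 1.1770, 1.1449, 1.1329, 1.0745, 1.0786, 1.1211, 0.9360,
    1.1515, 1.0680, 1.0567, 0.9769, 0.8673, 1.1511, 1.0232, 1.0743, 1.0368, 0.9307,
    0.9608, 1.0988, 0.9674, 0.9617, 0.8711, 0.8694, 0.9395, 1.0876, 1.0807, 0.9068,
    0.9016, 0.9432, 0.9689, 0.9866, 1.0142, 1.1917, 0.9925, 0.9636, 0.8888, 0.7460,
]


def moving_average(values, window):
    """Mean of each consecutive run of `window` values."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


smoothed = moving_average(LOSSES, window=10)
print(f"mean loss, steps 10-100:  {smoothed[0]:.3f}")
print(f"mean loss, steps 310-400: {smoothed[-1]:.3f}")
```

With this window, the mean loss drops from about 1.175 over the first 100 steps to about 0.960 over the last 100.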
Usage
To use this model, load it with the following configuration:
import warnings

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

warnings.filterwarnings("ignore", category=FutureWarning)

# Model Configuration
model_name = "MahmoudIbrahim/FalconMasr"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
)
model.config.use_cache = False
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
)
# The tokenizer has no dedicated padding token, so reuse the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token
input_text = "كيف تختلف منصة المدفوعات المتكاملة لشركة أمريكان إكسبريس عن شبكات البطاقات المصرفية؟"
# Move inputs to the same device as the model
device = model.device
# Tokenize the input text
inputs = tokenizer(input_text, return_tensors="pt").to(device)
# Remove 'token_type_ids' if it's present in the inputs
inputs.pop('token_type_ids', None)
# Generate the output
output = model.generate(
    **inputs,
    max_length=200,
    use_cache=False,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode the generated output
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
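For a causal language model, the sequence returned by `generate` begins with the prompt tokens, so the decoded text above repeats the question before the answer. The sketch below shows one way to keep only the newly generated tokens. The helper name `continuation_ids` is hypothetical, and the example uses plain lists of made-up token ids so that it runs without loading the model.

```python
from typing import List, Sequence


def continuation_ids(output_ids: Sequence[int], prompt_len: int) -> List[int]:
    """Return only the token ids generated after the prompt."""
    if not 0 <= prompt_len <= len(output_ids):
        raise ValueError("prompt_len must be between 0 and len(output_ids)")
    return list(output_ids[prompt_len:])


# Toy example with made-up token ids: three prompt tokens, two generated tokens.
prompt_ids = [101, 102, 103]
full_output = prompt_ids + [7, 8]

new_ids = continuation_ids(full_output, len(prompt_ids))
print(new_ids)  # [7, 8]
```

With the objects from the usage snippet, the same idea would be `tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.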