🇪🇬 Egypt LLM Fine-Tune

Model ID: quantumsquatan/egypt-llm-finetune

This is an educational LoRA fine-tune of TinyLlama, created in Egypt while working through the Hugging Face LLM Course. It is designed to give simple explanations of Computer Science, AI, programming, and student-life topics in English, with Egyptian Arabic context.

Model Details

Model Description

  • Developed by: quantumsquatan
  • Shared by: quantumsquatan
  • Model type: PEFT LoRA adapter for causal language modeling
  • Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Language(s): English and Arabic/Egyptian Arabic context
  • License: Apache 2.0
  • Fine-tuning method: Supervised fine-tuning (SFT) with LoRA
  • PEFT type: LoRA
  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, v_proj
  • Task type: CAUSAL_LM
  • Tokenizer max length: 2048
  • Adapter size: about 4.5 MB
  • Funding: No external funding reported

Model Sources

  • Repository: https://huggingface.co/quantumsquatan/egypt-llm-finetune

Uses

Direct Use

This model can be used as a lightweight educational chat model for:

  • Explaining Computer Science concepts in simple language
  • Explaining basic AI and machine learning ideas
  • Helping students understand programming and technical study topics
  • Casual English/Arabic conversation with Egyptian student context
  • Demonstrating how LoRA fine-tuning works on a small model

Downstream Use

This adapter can be loaded on top of TinyLlama/TinyLlama-1.1B-Chat-v1.0 in educational demos, notebooks, small experiments, and student projects. It can also be used as a starting point for further fine-tuning on larger, higher-quality Arabic or CS-focused instruction datasets.

Out-of-Scope Use

This model should not be used for:

  • Legal, medical, financial, safety-critical, or high-stakes decisions
  • Production systems that require factual guarantees
  • Replacing teachers, domain experts, lawyers, doctors, or professional advisors
  • Generating harmful, deceptive, or abusive content
  • Claims of broad Arabic mastery or expert-level Egyptian dialect understanding

Bias, Risks, and Limitations

This is a first educational fine-tune trained on a small custom dataset of 41 examples. Because the dataset is small, the model can overfit, hallucinate, repeat patterns, or answer outside its training scope. The base model is also small, so reasoning quality, factual accuracy, Arabic fluency, and instruction following may be limited.

Known limitations:

  • Small training dataset
  • No published benchmark evaluation
  • Not a legal, medical, or professional advice model
  • May produce incorrect or outdated information
  • May mix English and Arabic unexpectedly
  • May reflect biases from the base model and the custom examples

Recommendations

Use this model for learning, demos, and experimentation. Verify important information with reliable sources. For better results, evaluate outputs manually, train on a larger dataset, and test with held-out prompts before any serious downstream use.
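One way to act on the held-out recommendation is to set aside a few examples before training. The following is a minimal sketch, not the procedure used for this model: the `split_examples` helper, the 20% fraction, the seed, and the placeholder records are all assumptions for illustration.

```python
import random


def split_examples(examples, held_out_fraction=0.2, seed=42):
    """Shuffle a copy of the examples and split off a held-out portion."""
    if not 0 < held_out_fraction < 1:
        raise ValueError("held_out_fraction must be between 0 and 1")
    shuffled = list(examples)
    # A seeded Random instance keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_held_out = max(1, round(len(shuffled) * held_out_fraction))
    return shuffled[n_held_out:], shuffled[:n_held_out]


# Hypothetical stand-ins for the 41 training examples (not the real data).
examples = [{"id": i, "prompt": f"example prompt {i}"} for i in range(41)]
train, held_out = split_examples(examples)
print(len(train), len(held_out))
```

With only 41 examples the held-out set is tiny, so it is better suited to spotting obvious regressions than to producing stable scores.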

How to Get Started with the Model

This repository contains a PEFT LoRA adapter. Load it with the base model and tokenizer:

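# Requires transformers, peft, and accelerate (needed for device_map="auto")
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "quantumsquatan/egypt-llm-finetune"

# Load the tokenizer saved with the adapter, then the base model weights
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto"
)
# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

messages = [
    {"role": "user", "content": "Explain neural networks simply."}
]

# Render the conversation with the chat template, ending at the assistant turn
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# return_full_text=False prints only the model's reply, not the prompt
output = pipe(prompt, max_new_tokens=200, do_sample=True, temperature=0.7, return_full_text=False)
print(output[0]["generated_text"])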

Training Details

Training Data

The model was trained on 41 custom examples focused on:

  • Computer Science explanations
  • AI and machine learning explanations
  • Programming learning support
  • Student life in Egypt
  • General educational conversation
  • English with Egyptian Arabic context

The dataset has not been published as a separate Hugging Face dataset.

Training Procedure

The model was fine-tuned using supervised fine-tuning with PEFT LoRA adapters on top of TinyLlama. The adapter targets attention projection modules q_proj and v_proj.

Preprocessing

The examples were formatted as chat/instruction-style text for causal language modeling. Tokenization uses the TinyLlama/Llama tokenizer with a maximum model length of 2048 tokens.
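To make the chat-style formatting concrete, here is a small pure-Python sketch of a Zephyr-style layout, which is the style the TinyLlama chat template is understood to follow. The exact format used for training is not documented, so the role markers, whitespace, and the `format_chat` helper are assumptions; in practice `tokenizer.apply_chat_template` is the source of truth.

```python
EOS_TOKEN = "</s>"  # Llama-family end-of-sequence token (assumed here)


def format_chat(messages, add_generation_prompt=False):
    """Render role/content messages as a single Zephyr-style training string."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in {"system", "user", "assistant"}:
            raise ValueError(f"unsupported role: {role!r}")
        # Each turn is a role marker, the content, and an end-of-sequence token.
        parts.append(f"<|{role}|>\n{message['content']}{EOS_TOKEN}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)


example = [
    {"role": "user", "content": "What is a stack?"},
    {"role": "assistant", "content": "A last-in, first-out data structure."},
]
print(format_chat(example))
```

For training, the full conversation (including the assistant reply) is rendered; for inference, the conversation ends with an open assistant marker.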

Training Hyperparameters

Published adapter configuration:

  • Training regime: LoRA supervised fine-tuning for causal language modeling
  • PEFT version: 0.19.1
  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Bias: none
  • Target modules: q_proj, v_proj
  • Task type: CAUSAL_LM
  • Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

Other run-specific trainer settings such as exact epoch count, batch size, optimizer, hardware, and wall-clock training time were not published in the model repository.
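The published adapter settings map onto a PEFT `LoraConfig` roughly as follows. This is a reconstruction from the values listed in this card, not the original training script, and it requires the `peft` package.

```python
# Sketch of a PEFT config matching the published adapter settings.
# Trainer settings (epochs, batch size, optimizer) are not part of this
# config and were not published.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                  # LoRA rank
    lora_alpha=32,                         # scaling numerator (alpha / r = 2)
    lora_dropout=0.05,
    bias="none",                           # no bias terms are trained
    target_modules=["q_proj", "v_proj"],   # attention query and value projections
    task_type="CAUSAL_LM",
)
print(lora_config)
```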

Speeds, Sizes, Times

  • Adapter checkpoint size: about 4.5 MB
  • Base model size: about 1.1 billion parameters (TinyLlama 1.1B)
  • Training time: not published
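The adapter size can be sanity-checked from the LoRA settings. The sketch below assumes the public TinyLlama-1.1B configuration (hidden size 2048, 22 decoder layers, 32 attention heads, 4 key/value heads) and 16-bit weight storage. Neither the dimensions nor the storage dtype is stated in this card, so treat both as assumptions.

```python
# Rough LoRA adapter size estimate. Model dimensions are ASSUMED from the
# public TinyLlama-1.1B config; they are not stated in this model card.
HIDDEN_SIZE = 2048
NUM_LAYERS = 22
NUM_HEADS = 32
NUM_KV_HEADS = 4
LORA_RANK = 16        # from the card
BYTES_PER_PARAM = 2   # ASSUMPTION: weights stored as fp16/bf16


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * in_features + out_features * rank


head_dim = HIDDEN_SIZE // NUM_HEADS   # per-head width
kv_dim = head_dim * NUM_KV_HEADS      # v_proj output width under grouped-query attention

q_params = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
v_params = lora_params(HIDDEN_SIZE, kv_dim, LORA_RANK)

total_params = NUM_LAYERS * (q_params + v_params)
size_mb = total_params * BYTES_PER_PARAM / 1e6

print(f"{total_params:,} trainable parameters, about {size_mb:.1f} MB")
```

Under these assumptions the estimate comes to about 2.25 million adapter parameters, which is consistent with the reported checkpoint size of about 4.5 MB.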

Evaluation

Testing Data, Factors, and Metrics

Testing Data

No formal held-out test set was published. The model should be evaluated manually with prompts covering CS, AI, programming, English, Arabic, and mixed English-Arabic use.

Factors

Recommended evaluation factors:

  • English CS explanation quality
  • Arabic/Egyptian Arabic clarity
  • Hallucination rate
  • Helpfulness for beginner students
  • Instruction following
  • Safety and refusal behavior on out-of-scope prompts

Metrics

No benchmark metrics are published. Suggested future metrics include human preference ratings, exact-match accuracy on small CS QA tests, hallucination checks, and qualitative comparison against the base TinyLlama model.
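As an illustration of the simplest suggested metric, the sketch below scores short answers against reference answers after light normalization. The `normalize` and `exact_match_rate` helpers and the sample answers are hypothetical; the strings are made up for the example and are not outputs of this model.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_rate(predictions, references) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return matches / len(references)


# Made-up answers for illustration only (not real model outputs).
references = ["stack", "Binary search", "O(n)"]
predictions = ["Stack.", "binary  search", "O(n^2)"]

rate = exact_match_rate(predictions, references)
print(f"exact match: {rate:.2f}")
```

Exact match only suits short factual answers; free-form explanations still need human rating or side-by-side comparison with the base model.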

Results

No formal benchmark results are published yet.

Summary

This model is best understood as a learning milestone and a compact PEFT adapter experiment, not as a production-grade assistant.

Model Examination

No interpretability or internal model examination has been published.

Environmental Impact

Formal carbon accounting was not recorded. Because this is a small LoRA adapter fine-tune on TinyLlama, the compute footprint is expected to be much lower than full-model fine-tuning, but exact emissions cannot be claimed without hardware and runtime logs.

  • Hardware type: not published
  • Hours used: not published
  • Cloud provider: not published
  • Compute region: not published
  • Carbon emitted: not calculated

Technical Specifications

Model Architecture and Objective

The base architecture is TinyLlama, a Llama-style causal decoder-only language model. This repository provides a LoRA adapter trained for next-token prediction on instruction/chat-style educational examples.
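For reference, the standard LoRA formulation and the next-token objective can be written as below. This is the textbook form with this card's rank and alpha substituted in; it assumes PEFT's default scaling of alpha divided by rank. Whether the loss was restricted to assistant tokens during training is not stated in the repository.

```latex
% LoRA: the frozen weight W_0 is augmented by a scaled low-rank update B A.
\[
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad
B \in \mathbb{R}^{d_{\text{out}} \times r},\;
A \in \mathbb{R}^{r \times d_{\text{in}}},\;
\frac{\alpha}{r} = \frac{32}{16} = 2
\]

% Causal language modeling: minimize the negative log-likelihood of each token
% given the tokens before it. Only A and B receive gradient updates.
\[
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_{\theta}\left(x_t \mid x_{<t}\right)
\]
```

Here W_0 is a frozen q_proj or v_proj weight matrix, and only the low-rank factors A and B are stored in the adapter.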

Compute Infrastructure

The exact training infrastructure is not published in the repository.

Hardware

Not published.

Software

Known software stack from repository metadata:

  • Transformers
  • PEFT 0.19.1
  • TRL
  • Safetensors
  • TinyLlama tokenizer/model family

Citation

No paper has been published for this fine-tune. If you use it, cite the repository:

@misc{quantumsquatan_egypt_llm_finetune_2026,
  title = {Egypt LLM Fine-Tune},
  author = {quantumsquatan},
  year = {2026},
  url = {https://huggingface.co/quantumsquatan/egypt-llm-finetune}
}

Glossary

  • LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning method.
  • PEFT: Parameter-Efficient Fine-Tuning.
  • SFT: Supervised Fine-Tuning.
  • Adapter: A small set of trainable weights loaded on top of a base model.

More Information

This model is part of a learning project around Hugging Face, LLM fine-tuning, and AI education in Egypt.

Model Card Authors

quantumsquatan, with README/model-card structuring assistance.

Model Card Contact

Use the Hugging Face model page or GitHub repository issues for contact.

Framework Versions

  • PEFT 0.19.1