🇪🇬 Egypt LLM Fine-Tune
Model ID: quantumsquatan/egypt-llm-finetune
This is an educational LoRA fine-tune of TinyLlama, created in Egypt while following the Hugging Face LLM Course. It is intended for simple explanations of computer science, AI, programming, and student life, in English with Egyptian Arabic context.
Model Details
Model Description
- Developed by: quantumsquatan
- Shared by: quantumsquatan
- Model type: PEFT LoRA adapter for causal language modeling
- Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Language(s): English and Arabic/Egyptian Arabic context
- License: Apache 2.0
- Fine-tuning method: Supervised fine-tuning (SFT) with LoRA
- PEFT type: LoRA
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, v_proj
- Task type: CAUSAL_LM
- Tokenizer max length: 2048
- Adapter size: about 4.5 MB
- Funding: No external funding reported
Model Sources
- Hugging Face model: quantumsquatan/egypt-llm-finetune
- GitHub repository: omar-rr/egypt-llm-finetune
- Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Paper: No paper published for this educational fine-tune
- Demo: No dedicated demo published yet
Uses
Direct Use
This model can be used as a lightweight educational chat model for:
- Explaining Computer Science concepts in simple language
- Explaining basic AI and machine learning ideas
- Helping students understand programming and technical study topics
- Casual English/Arabic conversation with Egyptian student context
- Demonstrating how LoRA fine-tuning works on a small model
Downstream Use
This adapter can be loaded on top of TinyLlama/TinyLlama-1.1B-Chat-v1.0 in educational demos, notebooks, small experiments, and student projects. It can also be used as a starting point for further fine-tuning on larger, higher-quality Arabic or CS-focused instruction datasets.
Out-of-Scope Use
This model should not be used for:
- Legal, medical, financial, safety-critical, or high-stakes decisions
- Production systems that require factual guarantees
- Replacing teachers, domain experts, lawyers, doctors, or professional advisors
- Generating harmful, deceptive, or abusive content
- Claims of broad Arabic mastery or expert-level Egyptian dialect understanding
Bias, Risks, and Limitations
This is a first educational fine-tune trained on a small custom dataset of 41 examples. Because the dataset is small, the model can overfit, hallucinate, repeat patterns, or answer outside its training scope. The base model is also small, so reasoning quality, factual accuracy, Arabic fluency, and instruction following may be limited.
Known limitations:
- Small training dataset
- No published benchmark evaluation
- Not a legal, medical, or professional advice model
- May produce incorrect or outdated information
- May mix English and Arabic unexpectedly
- May reflect biases from the base model and the custom examples
Recommendations
Use this model for learning, demos, and experimentation. Verify important information with reliable sources. For better results, evaluate outputs manually, add a larger dataset, and test with held-out prompts before any serious downstream use.
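One way to act on the held-out recommendation is to reserve a few examples before training. The sketch below is only an illustration: the function name `split_holdout`, the hold-out size of 5, and the placeholder records are assumptions, not part of the published training setup.

```python
import random


def split_holdout(examples, holdout_size=5, seed=0):
    """Shuffle a copy of the examples and reserve `holdout_size` of them for testing."""
    if not 0 < holdout_size < len(examples):
        raise ValueError("holdout_size must be between 1 and len(examples) - 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    return shuffled[holdout_size:], shuffled[:holdout_size]


# Placeholder records standing in for the 41 custom examples (hypothetical).
examples = [{"id": i} for i in range(41)]
train, held_out = split_holdout(examples)
print(len(train), len(held_out))  # 36 5
```

With only 41 examples, even a small hold-out set removes a noticeable share of the training data, so the split size is a trade-off rather than a fixed rule.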
How to Get Started with the Model
This repository contains a PEFT LoRA adapter. Load it with the base model and tokenizer:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "quantumsquatan/egypt-llm-finetune"

# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

messages = [
    {"role": "user", "content": "Explain neural networks simply."}
]

# Render the conversation with the chat template and add the assistant turn marker.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# generated_text contains the prompt followed by the model's reply.
print(pipe(prompt, max_new_tokens=200, do_sample=True, temperature=0.7)[0]["generated_text"])
Training Details
Training Data
The model was trained on 41 custom examples focused on:
- Computer Science explanations
- AI and machine learning explanations
- Programming learning support
- Student life in Egypt
- General educational conversation
- English with Egyptian Arabic context
The dataset is not published separately as a Hugging Face dataset.
Training Procedure
The model was fine-tuned using supervised fine-tuning with PEFT LoRA adapters on top of TinyLlama. The adapter targets attention projection modules q_proj and v_proj.
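For readers new to LoRA, the update applied to each targeted projection can be written as below. This is the standard LoRA formulation rather than something stated in the repository, and it assumes the default alpha-over-rank scaling.

```latex
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d_{\text{out}} \times r},\quad
A \in \mathbb{R}^{r \times d_{\text{in}}}
```

Here the base weight W_0 stays frozen and only A and B are trained. With the published values r = 16 and alpha = 32, the scaling factor alpha / r is 2.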
Preprocessing
The examples were formatted as chat/instruction-style text for causal language modeling. Tokenization uses the TinyLlama/Llama tokenizer with a maximum model length of 2048 tokens.
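The exact record layout of the training data is not published, so the following is a minimal sketch of what chat-style formatting could look like. The field names `instruction` and `response`, the helper `to_messages`, and the sample record are all hypothetical.

```python
def to_messages(example, system_prompt=None):
    """Convert one instruction/response record into a chat-style message list."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": example["instruction"].strip()})
    messages.append({"role": "assistant", "content": example["response"].strip()})
    return messages


# Hypothetical record; the real dataset fields may differ.
example = {
    "instruction": "  What is an algorithm? ",
    "response": "An algorithm is a step-by-step procedure for solving a problem.",
}
messages = to_messages(example)
print(messages)
```

A message list in this shape can be passed to the tokenizer's `apply_chat_template` method to produce the training text, which should then stay within the 2048-token limit.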
Training Hyperparameters
Published adapter configuration:
- Training regime: LoRA supervised fine-tuning for causal language modeling
- PEFT version: 0.19.1
- LoRA rank (r): 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Bias: none
- Target modules: q_proj, v_proj
- Task type: CAUSAL_LM
- Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
Other run-specific trainer settings such as exact epoch count, batch size, optimizer, hardware, and wall-clock training time were not published in the model repository.
Speeds, Sizes, Times
- Adapter checkpoint size: about 4.5 MB
- Base model size: inherited from TinyLlama 1.1B
- Training time: not published
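The adapter size can be sanity-checked with a little arithmetic. The sketch below assumes the TinyLlama 1.1B dimensions from the base model's public configuration (hidden size 2048, 22 layers, 32 attention heads, 4 key/value heads) and 2 bytes per weight; none of these values are stated in this repository.

```python
# Assumed TinyLlama-1.1B dimensions (from the base model's config, not this repository).
HIDDEN_SIZE = 2048
NUM_LAYERS = 22
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 4
RANK = 16  # published LoRA rank

head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS  # 64
q_out = HIDDEN_SIZE                            # q_proj maps 2048 -> 2048
v_out = NUM_KV_HEADS * head_dim                # v_proj maps 2048 -> 256 (grouped-query attention)


def lora_params(d_in, d_out, r):
    """Parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


per_layer = lora_params(HIDDEN_SIZE, q_out, RANK) + lora_params(HIDDEN_SIZE, v_out, RANK)
total_params = per_layer * NUM_LAYERS
size_mb_16bit = total_params * 2 / 1e6  # assumes 2 bytes per weight

print(total_params, round(size_mb_16bit, 2))  # 2252800 4.51
```

Under these assumptions the adapter holds about 2.25 million trainable weights, which is consistent with the reported size of about 4.5 MB if the weights are stored in 16-bit precision. That storage format is an inference, not a published detail.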
Evaluation
Testing Data, Factors, and Metrics
Testing Data
No formal held-out test set was published. The model should be evaluated manually with prompts covering CS, AI, programming, English, Arabic, and mixed English-Arabic use.
Factors
Recommended evaluation factors:
- English CS explanation quality
- Arabic/Egyptian Arabic clarity
- Hallucination rate
- Helpfulness for beginner students
- Instruction following
- Safety and refusal behavior on out-of-scope prompts
Metrics
No benchmark metrics are published. Suggested future metrics include human preference ratings, exact-match accuracy on small CS QA tests, hallucination checks, and qualitative comparison against the base TinyLlama model.
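As a starting point for a small CS QA check, a keyword-based scorer can be run against any text generator. Everything in this sketch is hypothetical: the helper names, the test cases, and the stub generator, which stands in for a real call to the model. Keyword matching is a crude proxy and does not replace human review.

```python
def keyword_score(answer, expected_keywords):
    """Fraction of expected keywords that appear in the answer (case-insensitive)."""
    text = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)


def evaluate(generate, test_cases):
    """Average keyword score of `generate(prompt)` over all test cases."""
    scores = [keyword_score(generate(case["prompt"]), case["keywords"]) for case in test_cases]
    return sum(scores) / len(scores)


# Hypothetical test cases; a real set should come from held-out prompts.
test_cases = [
    {"prompt": "What is a stack?", "keywords": ["last in", "first out"]},
    {"prompt": "What does CPU stand for?", "keywords": ["central processing unit"]},
]


def stub_generate(prompt):
    """Stand-in for the model; replace with a call to the fine-tuned pipeline."""
    answers = {
        "What is a stack?": "A stack is a last in, first out data structure.",
        "What does CPU stand for?": "It stands for Central Processing Unit.",
    }
    return answers[prompt]


score = evaluate(stub_generate, test_cases)
print(score)  # 1.0
```

Running the same test cases against both the base TinyLlama model and the adapter gives a rough side-by-side comparison.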
Results
No formal benchmark results are published yet.
Summary
This model is best understood as a learning milestone and a compact PEFT adapter experiment, not as a production-grade assistant.
Model Examination
No interpretability or internal model examination has been published.
Environmental Impact
Formal carbon accounting was not recorded. Because this is a small LoRA adapter fine-tune on TinyLlama, the compute footprint is expected to be much lower than full-model fine-tuning, but exact emissions cannot be claimed without hardware and runtime logs.
- Hardware type: not published
- Hours used: not published
- Cloud provider: not published
- Compute region: not published
- Carbon emitted: not calculated
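If hardware and runtime logs become available, a common back-of-the-envelope estimate multiplies energy use by grid carbon intensity. The formula below is a general approximation, not a calculation performed for this model, and all four quantities are unknown here.

```latex
\text{CO}_2\text{e (kg)} \approx
P_{\text{avg}}\,[\text{kW}] \times t\,[\text{h}] \times \text{PUE} \times
I\,[\text{kg CO}_2\text{e/kWh}]
```

Here P_avg is the average power draw of the hardware, t is the training time, PUE is the data center's power usage effectiveness, and I is the carbon intensity of the local electricity grid.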
Technical Specifications
Model Architecture and Objective
The base architecture is TinyLlama, a Llama-style causal decoder-only language model. This repository provides a LoRA adapter trained for next-token prediction on instruction/chat-style educational examples.
Compute Infrastructure
The exact training infrastructure is not published in the repository.
Hardware
Not published.
Software
Known software stack from repository metadata:
- Transformers
- PEFT 0.19.1
- TRL
- Safetensors
- TinyLlama tokenizer/model family
Citation
No paper has been published for this fine-tune. If you use it, cite the repository:
@misc{quantumsquatan_egypt_llm_finetune_2026,
title = {Egypt LLM Fine-Tune},
author = {quantumsquatan},
year = {2026},
url = {https://huggingface.co/quantumsquatan/egypt-llm-finetune}
}
Glossary
- LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning method.
- PEFT: Parameter-Efficient Fine-Tuning.
- SFT: Supervised Fine-Tuning.
- Adapter: A small set of trainable weights loaded on top of a base model.
More Information
This model is part of a learning project around Hugging Face, LLM fine-tuning, and AI education in Egypt.
Model Card Authors
quantumsquatan, with README/model-card structuring assistance.
Model Card Contact
Use the Hugging Face model page or GitHub repository issues for contact.
Framework Versions
- PEFT 0.19.1