Qwen3-1.7B Egyptian Arabic Customer Service (LoRA)

A LoRA adapter fine-tuned on top of Qwen/Qwen3-1.7B for Egyptian Arabic customer service conversations. The model handles real-world customer inquiries in colloquial Egyptian Arabic — order tracking, delivery issues, product questions, and complaint resolution.

Model Details

Model Description

This adapter was trained on a custom dataset of 257 multi-turn Egyptian Arabic customer service conversations using Low-Rank Adaptation (LoRA). Only the lightweight adapter weights are stored here; the base Qwen3-1.7B weights remain unchanged and are loaded separately at inference time.

Developed by: Youssef Eldenary
Model type: Causal Language Model — LoRA adapter (PEFT) over Qwen3-1.7B
Language(s) (NLP): Arabic — Egyptian dialect (عامية مصرية)
License: MIT
Finetuned from model: Qwen/Qwen3-1.7B

Model Sources

Repository: https://github.com/Eldenary/qwen-Customer-Service-lora

Direct Use

This model is intended for Egyptian Arabic customer service chatbots. Load the adapter on top of the Qwen3-1.7B base model and query it directly with customer messages in Egyptian Arabic dialect. Example scenarios:

Order tracking: "الأوردر لسه موصلش" ("The order still hasn't arrived") → the model asks for the order number and reassures the customer
Delivery issues: "الشحنة اتأخرت" ("The shipment is late") → the model acknowledges the delay and explains next steps
Returns & refunds: "عايز أرجع المنتج" ("I want to return the product") → the model walks through the return process
General complaints: "السلام عليكم، عندي مشكلة في الأوردر" ("Hello, I have a problem with my order") → the model opens a support dialogue

Downstream Use

The adapter can be merged into the base model and integrated into a larger customer support pipeline, chatbot backend (e.g., FastAPI + React), or voice assistant targeting Egyptian Arabic-speaking users. It is well-suited as the language generation component in a retrieval-augmented or tool-calling customer service system.
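
A minimal sketch of the merge step, assuming the standard PEFT `merge_and_unload()` workflow. The function name `merge_adapter` and the output directory are hypothetical, and the tokenizer is loaded from the adapter repository in the same way as the getting-started snippet in this card. The heavy imports sit inside the function so the module can be imported without loading the model stack.

```python
BASE_MODEL = "Qwen/Qwen3-1.7B"
ADAPTER = "Eldenary/qwen-Customer-Service-lora"
MERGED_DIR = "qwen3-1.7b-egyptian-cs-merged"  # hypothetical output directory


def merge_adapter(base_id=BASE_MODEL, adapter_id=ADAPTER, out_dir=MERGED_DIR):
    """Fold the LoRA weights into the base model and save a standalone checkpoint."""
    # Imported lazily: these pull in the full model stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_id)

    # merge_and_unload() adds the low-rank deltas to the base weights and
    # returns a plain Transformers model with no PEFT wrappers left.
    merged = model.merge_and_unload()

    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_adapter()` downloads both repositories and writes a checkpoint that can then be loaded with `AutoModelForCausalLM.from_pretrained(MERGED_DIR)` without PEFT installed.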

Out-of-Scope Use

Other Arabic dialects: The model was trained exclusively on Egyptian Arabic and will likely underperform on Levantine, Gulf, Moroccan, or MSA (Modern Standard Arabic) inputs.
General-purpose assistant: This is not a general assistant. Performance on topics outside customer service (e.g., coding, science, creative writing) will be limited.
High-stakes decisions: The model should not be used for automated decisions involving refunds, account actions, or policy enforcement without human review.
Medical, legal, or financial advice: Not appropriate for any of these domains.

Bias, Risks, and Limitations

Dialect bias: Egyptian Arabic only. Other dialects are out of distribution.
Domain bias: Trained on e-commerce/order-management scenarios. Responses to unrelated queries may be irrelevant or generic.
Small dataset: 257 conversations is a small training set. The model may overfit to certain phrasing patterns or fail on edge cases not seen during training.
Hallucination: Like all LLMs, the model can produce fluent but incorrect responses. It has no access to live order data and should always be paired with a backend data source for factual order information.
No content filtering: The adapter does not include a safety classifier. Downstream deployments should add moderation where appropriate.

Recommendations

Always connect the model to a live order management system rather than relying on it for factual order status.
Fall back to a human agent for complaints where the model's reply expresses uncertainty.
Monitor outputs regularly for quality and tone, especially as the volume of user interactions grows.
Inform users they are interacting with an AI assistant.
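
The first two recommendations can be sketched as a small routing layer in front of the model. Everything here is hypothetical: `FAKE_ORDER_DB` stands in for a real order management system, the order-number pattern is an assumed format, and `generate_reply(message, facts)` is a placeholder for whatever function wraps the model. In this sketch the handoff to a human is triggered by an unknown order number, which is only one possible escalation signal.

```python
import re

# Hypothetical stand-in for a live order management system.
FAKE_ORDER_DB = {"10234": "out for delivery", "10235": "delayed at the warehouse"}

# Assumed order-number format: four or more ASCII digits.
ORDER_ID_PATTERN = re.compile(r"[0-9]{4,}")


def lookup_order(order_id):
    """Return the order status from the backend, or None if the order is unknown."""
    return FAKE_ORDER_DB.get(order_id)


def route(message, generate_reply):
    """Ground the reply on backend facts, or hand the conversation to a human."""
    match = ORDER_ID_PATTERN.search(message)
    if match is None:
        # No order number yet: let the model ask for it, with no facts to cite.
        return {"handler": "model", "reply": generate_reply(message, None)}

    status = lookup_order(match.group())
    if status is None:
        # The backend does not know this order: escalate rather than guess.
        return {"handler": "human", "reply": None}

    facts = {"order_id": match.group(), "status": status}
    return {"handler": "model", "reply": generate_reply(message, facts)}
```

The design choice is that order status always comes from the backend lookup and is passed to the model as context, so the model never has to produce it from memory.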

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-1.7B"
adapter = "Eldenary/qwen-Customer-Service-lora"

# Load the frozen base model, then attach the LoRA adapter on top of it
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto", torch_dtype=torch.float16)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(adapter)

# Run a customer query in Egyptian Arabic ("The order still hasn't arrived.")
prompt = "الأوردر لسه موصلش."

# Wrap the query in the Qwen3 chat format and append the assistant turn marker
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Inference only, so gradients are not needed
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

Training Details

Training Data

A custom dataset of 257 multi-turn customer service conversations in colloquial Egyptian Arabic. Each conversation is stored as a list of role/content messages and rendered with the Qwen3 chat template at training time:

{
  "conversations": [
    { "role": "user",      "content": "السلام عليكم، عندي مشكلة في الأوردر." },
    { "role": "assistant", "content": "وعليكم السلام، تحت أمرك. ممكن تقولي رقم الأوردر؟" }
  ]
}

The dataset covers order tracking, delivery delays, product returns, refund requests, and general customer complaints — all in Egyptian Arabic dialect. No external dataset was used; the data was collected and curated manually.
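
A record in this format can be checked before training with a few lines of plain Python. This is a sketch, not the author's tooling: the function name `validate_conversation` is hypothetical, and the rules it enforces (turns alternate user/assistant, start with the user, end with the assistant, no empty content) are assumptions inferred from the sample record rather than documented constraints.

```python
import json

RAW_RECORD = """
{
  "conversations": [
    { "role": "user",      "content": "السلام عليكم، عندي مشكلة في الأوردر." },
    { "role": "assistant", "content": "وعليكم السلام، تحت أمرك. ممكن تقولي رقم الأوردر؟" }
  ]
}
"""


def validate_conversation(record):
    """Return the list of turns, or raise ValueError if the record is malformed."""
    turns = record.get("conversations")
    if not turns:
        raise ValueError("record has no conversation turns")

    for index, turn in enumerate(turns):
        # Assumed convention: even positions are the user, odd are the assistant.
        expected_role = "user" if index % 2 == 0 else "assistant"
        if turn.get("role") != expected_role:
            raise ValueError(f"turn {index}: expected role {expected_role!r}")
        if not str(turn.get("content", "")).strip():
            raise ValueError(f"turn {index}: empty content")

    if turns[-1]["role"] != "assistant":
        raise ValueError("conversation must end with an assistant turn")
    return turns


turns = validate_conversation(json.loads(RAW_RECORD))
print(f"valid conversation with {len(turns)} turns")
```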

Training Procedure

Preprocessing

Each conversation was passed through tokenizer.apply_chat_template() to produce the Qwen3 chat format string, then tokenized with truncation and padding to a maximum sequence length of 512 tokens. Labels were set equal to input_ids for standard causal language modeling (next-token prediction over the full sequence).
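
The preprocessing described above can be sketched as a single mapping function. The name `preprocess` is hypothetical and the actual training script may differ in detail. The tokenizer is passed in as a parameter, so any Hugging Face tokenizer with a chat template (for example the Qwen3 one) can be used.

```python
MAX_LENGTH = 512  # maximum sequence length stated in this card


def preprocess(example, tokenizer, max_length=MAX_LENGTH):
    """Render one conversation with the chat template and tokenize it for causal LM."""
    # Produce the chat-formatted string; tokenize=False returns text, not ids.
    text = tokenizer.apply_chat_template(example["conversations"], tokenize=False)

    # Truncate long conversations and pad short ones to a fixed length.
    encoded = tokenizer(
        text,
        truncation=True,
        padding="max_length",
        max_length=max_length,
    )

    # Labels are a copy of input_ids: next-token prediction over the full sequence.
    encoded["labels"] = list(encoded["input_ids"])
    return encoded
```

With a Hugging Face `datasets.Dataset`, this would typically be applied as `dataset.map(lambda ex: preprocess(ex, tokenizer))`. Because the labels mirror `input_ids` exactly, padded positions are not masked here. Setting the label to -100 at padded positions is a common variant, but it is not what this card describes.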

Training Hyperparameters

Training regime: fp16 mixed precision

| Parameter | Value |
| --- | --- |
| Epochs | 5 |
| Per-device batch size | 2 |
| Gradient accumulation steps | 2 (effective batch size = 4) |
| Learning rate | 2e-4 |
| Max sequence length | 512 tokens |
| Optimizer | AdamW (Trainer default) |
| Checkpointing | Every 50 steps, last 2 kept |
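
As a rough check on what these settings imply, the arithmetic below estimates the number of optimizer steps. It assumes a single device and that all 257 conversations are used for training, neither of which is stated in this card, and the exact step count can differ by one per epoch depending on how the Trainer version handles the final partial accumulation.

```python
import math

NUM_CONVERSATIONS = 257
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 2
EPOCHS = 5
SAVE_EVERY = 50
NUM_DEVICES = 1  # assumption: single-GPU training

# Samples consumed per optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Approximate optimizer updates per epoch and in total.
steps_per_epoch = math.ceil(NUM_CONVERSATIONS / effective_batch)
total_steps = steps_per_epoch * EPOCHS

# Step-based checkpoints written over the run (only the last 2 are kept).
checkpoints_written = total_steps // SAVE_EVERY

print(f"effective batch size: {effective_batch}")
print(f"approx. steps per epoch: {steps_per_epoch}")
print(f"approx. total steps: {total_steps}")
print(f"step-based checkpoints written: {checkpoints_written}")
```

Under these assumptions the run is short, roughly 325 optimizer steps.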

LoRA configuration:

| Parameter | Value |
| --- | --- |
| Rank (`r`) | 8 |
| Alpha | 16 |
| Dropout | 0.05 |
| Bias | none |
| Task type | CAUSAL_LM |
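
In PEFT these values map onto a `LoraConfig`. The sketch below is an illustration rather than the original training configuration. In particular, the card does not list the target modules, so `ASSUMED_TARGET_MODULES` is a guess based on the statement that the adapter is injected into the attention layers. The PEFT import is deferred into the function so the hyperparameters can be inspected without the library installed.

```python
# Values taken from the LoRA configuration table in this card.
LORA_HYPERPARAMS = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
}

# Assumption: attention projection names as used by Qwen-style models.
ASSUMED_TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]


def build_lora_config():
    """Build a PEFT LoraConfig from the documented hyperparameters."""
    from peft import LoraConfig  # imported lazily; requires the peft package

    return LoraConfig(target_modules=ASSUMED_TARGET_MODULES, **LORA_HYPERPARAMS)


# LoRA scales the low-rank update by alpha / r before adding it to the base output.
scaling = LORA_HYPERPARAMS["lora_alpha"] / LORA_HYPERPARAMS["r"]
print(f"LoRA scaling factor (alpha / r): {scaling}")
```

The resulting config would be passed to `peft.get_peft_model(base_model, build_lora_config())` before training.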

Evaluation

Factors

Evaluation focused on:

In-domain queries: Order tracking, delivery issues, returns — topics covered in training data
Edge cases: Ambiguous or multi-step complaints requiring context from earlier turns

Metrics

Formal automated metrics (BLEU, ROUGE, perplexity) were not computed. Evaluation was qualitative, assessing:

Dialect naturalness — does the response sound like authentic Egyptian Arabic?
Relevance — does the response address the customer's actual issue?
Tone — is the response polite, professional, and helpful?
Context retention — does the model correctly refer to information from earlier turns?

Results

In qualitative review, the model produced fluent, natural-sounding Egyptian Arabic responses to customer service queries within the training domain. It carried context across turns and resolved references to previously mentioned orders or issues. Response quality dropped noticeably on queries outside the customer service domain and on non-Egyptian Arabic dialects.

Summary

A lightweight LoRA adapter that adds Egyptian Arabic customer service capability to Qwen3-1.7B with minimal compute. Suitable for deployment in chatbot pipelines targeting Egyptian Arabic-speaking customers, with the caveat that it should always be paired with live data sources for order information.

Model Architecture and Objective

Architecture: Decoder-only Transformer (Qwen3-1.7B)
Objective: Causal language modeling — next-token prediction over Qwen3 chat-formatted sequences
Adapter method: LoRA via PEFT — injects trainable low-rank matrices into the attention layers; base model weights are frozen during training
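
To make the adapter method concrete, here is a toy, dependency-free sketch of a LoRA forward pass for one linear layer: the frozen weight `W0` is left untouched and a scaled low-rank product `B @ A` is added to its output. The sizes are deliberately tiny and are not the Qwen3-1.7B dimensions, and all names are illustrative.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W0, A, B, x, alpha, r):
    """Compute y = W0 x + (alpha / r) * B (A x); only A and B would be trained."""
    base = matvec(W0, x)                # frozen base projection
    update = matvec(B, matvec(A, x))    # low-rank path: d_in -> r -> d_out
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


random.seed(0)
d_out, d_in, r, alpha = 6, 6, 2, 4  # toy sizes; this card uses r=8, alpha=16

W0 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]  # r x d_in
B = [[0.0] * r for _ in range(d_out)]  # d_out x r, zero at initialization
x = [1.0] * d_in

# With B initialized to zero, the adapted layer reproduces the base layer exactly.
assert lora_forward(W0, A, B, x, alpha, r) == matvec(W0, x)

full_params = d_out * d_in          # parameters in the frozen matrix
lora_params = r * (d_in + d_out)    # trainable parameters in A and B
print(f"frozen: {full_params}, trainable LoRA: {lora_params}")
```

At realistic layer widths the gap is much larger, because the trainable count grows with `r * (d_in + d_out)` while the frozen matrix grows with `d_in * d_out`.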

Software

Python 3.10+
PyTorch 2.x
HuggingFace Transformers
PEFT 0.19.1
HuggingFace Datasets
Accelerate

Citation

BibTeX:

@misc{eldenary2025qwencustomerservice,
  author       = {Eldenary, Youssef},
  title        = {Qwen3-1.7B Egyptian Arabic Customer Service LoRA},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Eldenary/qwen-Customer-Service-lora}}
}

APA:

Eldenary, Y. (2025). Qwen3-1.7B Egyptian Arabic Customer Service LoRA [Model]. HuggingFace. https://huggingface.co/Eldenary/qwen-Customer-Service-lora

Glossary

LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that injects small trainable matrices into a frozen pre-trained model, dramatically reducing the number of trainable parameters.
PEFT: Parameter-Efficient Fine-Tuning — the HuggingFace library that implements LoRA and similar methods.
Egyptian Arabic (عامية مصرية): The colloquial spoken dialect of Arabic used in Egypt, distinct from Modern Standard Arabic (MSA) and other regional dialects.
Causal LM: A language model trained to predict the next token given all previous tokens — the standard objective for GPT-style models.
Chat template: A structured format that wraps conversation turns (user/assistant roles) into a single string the model can process.

More Information

Base model: Qwen/Qwen3-1.7B
Training code and dataset: https://github.com/Eldenary/qwen-Customer-Service-lora
PEFT documentation: https://huggingface.co/docs/peft

Model Card Contact

Open an issue on the GitHub repository for questions, feedback, or collaboration.

Framework versions

PEFT 0.19.1
