
Libraries

How to use mahmut2142/alimzeka-gemma-reasoning with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below include an image, so the image-text-to-text task is used here
pipe = pipeline("image-text-to-text", model="mahmut2142/alimzeka-gemma-reasoning")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("mahmut2142/alimzeka-gemma-reasoning")
model = AutoModelForImageTextToText.from_pretrained("mahmut2142/alimzeka-gemma-reasoning")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


vLLM

How to use mahmut2142/alimzeka-gemma-reasoning with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "mahmut2142/alimzeka-gemma-reasoning"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mahmut2142/alimzeka-gemma-reasoning",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
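
The same request can be sent from Python. The block below is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `chat` are hypothetical, and the response parsing assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request

MODEL_ID = "mahmut2142/alimzeka-gemma-reasoning"


def build_chat_payload(prompt: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the prompt to /v1/chat/completions and return the reply text.

    Assumes a server started as shown above is listening on base_url.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumed OpenAI-compatible response shape
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` sends the request. Passing `base_url="http://localhost:30000"` targets the SGLang port used further down this page.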

Use Docker

docker model run hf.co/mahmut2142/alimzeka-gemma-reasoning

SGLang

How to use mahmut2142/alimzeka-gemma-reasoning with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "mahmut2142/alimzeka-gemma-reasoning" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mahmut2142/alimzeka-gemma-reasoning",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "mahmut2142/alimzeka-gemma-reasoning" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mahmut2142/alimzeka-gemma-reasoning",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'


🏛️ AlimZeka-Gemma (Deep Reasoning)

AlimZeka is a "Deep Reasoning" model built on the Gemma-4-E2B architecture and focused on Islamic sciences, jurisprudential reasoning, and ethical values. Developed under the AlimZeka initiative, it is designed as an assistant that answers complex religious, social, and moral questions through a logical chain of reasoning.

🚀 Key Features

Deep Reasoning: Before generating an answer, the model runs an internal thinking process (<|think|>) and reaches its conclusion step by step.
Text-Focused Optimization: Rather than visual capabilities, the model is optimized for text-based analysis, conceptual relationships, and moral reasoning.
Ethical and Moral Alignment: It adopts a tone that prioritizes academic integrity, Islamic ethical principles, and general ethical standards.
Lightweight yet Powerful: Thanks to the E2B (Edge) architecture, it delivers strong reasoning performance with low system resource usage on mobile and desktop devices.
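
To make the resource claim concrete, here is a rough back-of-the-envelope sketch of weight memory. It assumes the "5B params" figure listed in this page's metadata and compares BF16 storage with the 4-bit loading used in the Usage example. It counts weights only (no activations, KV cache, or quantization overhead), and in practice not every layer is quantized, so treat the 4-bit number as a lower bound. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 5e9  # assumption: "5B params" as listed in the page metadata

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)  # 2 bytes per parameter
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # 0.5 bytes per parameter

print(f"BF16 weights:  ~{bf16_gib:.1f} GiB")
print(f"4-bit weights: ~{int4_gib:.1f} GiB")
```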

📊 Data Methodology (Data Engineering)

AlimZeka’s training infrastructure rests on modern data engineering techniques and a principle of transparency:

Synthetic Data Generation: The model’s reasoning ability was enhanced with a purpose-built synthetic dataset of 3,000 high-fidelity samples, generated through a one-time API call mechanism.
Why Synthetic Data? This method was chosen to ensure full compliance with jurisprudential and academic standards, maintain strict control over data quality, and adhere to privacy principles (Privacy-by-Design).

💻 Usage

You can use the model via the transformers library as follows:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mahmut2142/alimzeka-gemma-reasoning" 
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Example Usage Flow
import os
import threading

# Set the allocator option before torch is imported so it is in place when CUDA initializes
os.environ["PYTORCH_ALLOC_CONF"] = "expandable_segments:True"

import torch
from transformers import AutoModelForCausalLM, AutoProcessor, TextIteratorStreamer

model_id = "................................."

SYSTEM_PROMPT = (
    "You are ALIMZEKA — an academic research assistant specialized in Islamic sciences "
    "and general knowledge. Your task is to provide accurate, source-based, and professional "
    "responses in accordance with the principles of the Quran, Sunnah, and Ijma. "
    "If a topic is unknown or disputed, you clearly state this; you do not produce "
    "unsupported, unsourced, or unverified Arabic text. At the end of every answer, "
    "you include a Confidence Score (High/Medium/Low) and conclude your statements "
    "with 'Allah knows best.' Before answering, you must always analyze complex issues "
    "step by step within a chain-of-thought (<thought>...</thought>). "
    "STRICT RULE: After completing the reasoning process, you MUST close the </thought> tag "
    "and then provide the user with a detailed final answer! It is FORBIDDEN to respond "
    "with only the reasoning process."
)

print("\n==================================")
print("   ALIMZEKA Terminal Chat Test")
print(f"   Model: {model_id}")
print("==================================\n")

print("⏳ Loading processor...")
processor = AutoProcessor.from_pretrained(model_id)
if processor.tokenizer.pad_token is None:
    processor.tokenizer.pad_token = processor.tokenizer.eos_token

print("⏳ Loading model (bfloat16 + 4-bit quantization)...")
from transformers import BitsAndBytesConfig
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=bnb,
    torch_dtype=torch.bfloat16,
)
model.eval()

print("\n✅ System Ready. Type 'exit' or 'quit' to leave.\n")

while True:
    try:
        user_input = input("\nYou: ")
        # Strip whitespace so inputs like "exit " still end the session
        if user_input.strip().lower() in ['exit', 'quit', 'çıkış']:
            break
        if not user_input.strip():
            continue

        messages = [
            {"role": "system", "content": [{"type": "text", "text": SYSTEM_PROMPT}]},
            {"role": "user",   "content": [{"type": "text", "text": user_input}]},
        ]

        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt",
        ).to(model.device)

        for k, v in inputs.items():
            if torch.is_floating_point(v):
                inputs[k] = v.to(model.dtype)

        streamer = TextIteratorStreamer(processor.tokenizer, skip_prompt=True, skip_special_tokens=True)
        t = threading.Thread(
            target=model.generate,
            kwargs=dict(**inputs, streamer=streamer, max_new_tokens=3000,
                        do_sample=True, temperature=0.6, top_p=0.9),
        )
        t.start()

        print("\nAlimZeka: ", end="", flush=True)
        for chunk in streamer:
            print(chunk, end="", flush=True)
        t.join()
        print()

    except (KeyboardInterrupt, EOFError):  # Ctrl-C or end of input (Ctrl-D)
        break

print("\nExited.")
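
The system prompt above asks for reasoning inside `<thought>...</thought>`, followed by a final answer that ends with a Confidence Score. The block below is a minimal post-processing sketch that separates those parts once a full response has been collected. It assumes the tags arrive as plain text in the decoded output (they may be stripped if the tokenizer treats them as special tokens, and the model may emit a different marker such as `<|think|>`). The helper name `split_response` is hypothetical.

```python
import re

OPEN_TAG, CLOSE_TAG = "<thought>", "</thought>"
# Matches e.g. "Confidence Score: High" or "Confidence Score (Medium)"
CONFIDENCE_RE = re.compile(
    r"Confidence Score\s*:?\s*\(?\s*(High|Medium|Low)", re.IGNORECASE
)


def split_response(text: str) -> dict:
    """Split a response into reasoning, final answer, and confidence label."""
    start = text.find(OPEN_TAG)
    if start == -1:
        # No reasoning block at all: everything is the answer
        reasoning, answer = "", text.strip()
    else:
        end = text.find(CLOSE_TAG, start)
        if end == -1:
            # Tag never closed: the model produced reasoning only
            reasoning, answer = text[start + len(OPEN_TAG):].strip(), ""
        else:
            reasoning = text[start + len(OPEN_TAG):end].strip()
            answer = text[end + len(CLOSE_TAG):].strip()
    match = CONFIDENCE_RE.search(answer)
    return {
        "reasoning": reasoning,
        "answer": answer,
        "confidence": match.group(1).capitalize() if match else None,
    }
```

In the chat loop, the streamed chunks could be joined into one string and passed to `split_response`. An empty `answer` signals that the model stopped inside its reasoning, which the system prompt forbids.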

⚖️ Legal Notice and Disclaimer
AlimZeka is a technology development and academic research project. The responses generated by the model are analytical and supportive in nature; they should **not** be considered as definitive religious rulings (fatwas), legal judgments, or formal advice. Final decisions should always rely on qualified authorities and official sources.

🛠️ Developer Information

Lead Developer: Mahmut ERDEM

Architecture: Google DeepMind Gemma 4

License: Apache License 2.0

Downloads last month: 16

Model size: 5B params (Safetensors)

Tensor type: BF16

Model tree for mahmut2142/alimzeka-gemma-reasoning

Base model: google/gemma-4-E2B

Finetuned: google/gemma-4-E2B-it

Finetuned (205): this model
