Gemma-4-E2B-it — Phishing Email Classifier (LoRA)
LoRA fine-tune of google/gemma-4-E2B-it that classifies emails as phishing or legit with a single-word answer.
This repo contains:
- the LoRA adapter (PEFT, ~50 MB) — reproducible, composable
- ready-to-run GGUF quants (merged model) for llama.cpp / LM Studio: Q4_K_M (3.4 GB) and Q8_0 (5 GB)
Results
Evaluated on 1,000 held-out emails (balanced, English):
| Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| gemma-4-E2B-it zero-shot | 85.7% | 90.3% | 78.9% | 0.842 |
| + this LoRA | 95.6% | 98.4% | 98.0% | 0.982 |
Notes: 26/1000 outputs were off-format and counted as errors (accuracy on parsable outputs: 98.2%). Zero-shot baseline measured on a 300-email subset.
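The figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming the 95.6% accuracy corresponds to 956 correct answers out of 1,000 and that F1 is the usual harmonic mean of precision and recall; the function and variable names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


TOTAL = 1000      # held-out test emails
OFF_FORMAT = 26   # outputs that were not a clean one-word label
CORRECT = 956     # assumption: derived from 95.6% of 1,000

parsable = TOTAL - OFF_FORMAT
strict_accuracy = CORRECT / TOTAL        # off-format outputs count as errors
parsable_accuracy = CORRECT / parsable   # off-format outputs excluded

print(f"strict accuracy:   {strict_accuracy:.1%}")
print(f"parsable accuracy: {parsable_accuracy:.1%}")
print(f"LoRA F1:           {f1_score(0.984, 0.980):.3f}")
print(f"zero-shot F1:      {f1_score(0.903, 0.789):.3f}")
```

Reporting both accuracies separates classification quality from format compliance, which matters if the downstream parser can abstain on malformed outputs.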
Usage
The model expects this exact system prompt (it was trained with it):
You are an email security classifier. Classify the email as 'phishing' or 'legit'. Respond with one word only.
LM Studio / llama.cpp (GGUF)
Download phishing-gemma4-e2b-Q4_K_M.gguf, set the system prompt above, temperature 0, paste an email as the user message. Requires a llama.cpp build recent enough for the gemma4 architecture.
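The same setup can be scripted with llama-cpp-python instead of a GUI. The sketch below is one possible way to do it, not the author's reference code: the helper names `build_messages` and `classify_with_gguf` are hypothetical, and `max_tokens=5` is an assumption chosen because the expected answer is a single word.

```python
SYSTEM_PROMPT = (
    "You are an email security classifier. Classify the email as "
    "'phishing' or 'legit'. Respond with one word only."
)


def build_messages(email_text: str) -> list[dict]:
    """Pair the training-time system prompt with the email as the user turn."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": email_text},
    ]


def classify_with_gguf(email_text: str) -> str:
    """Download the Q4_K_M quant from the Hub and classify one email."""
    # Imported lazily so build_messages works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="vete-speis/gemma-4-E2B-it-phishing",
        filename="phishing-gemma4-e2b-Q4_K_M.gguf",
        verbose=False,
    )
    result = llm.create_chat_completion(
        messages=build_messages(email_text),
        temperature=0.0,  # deterministic output, as recommended above
        max_tokens=5,     # assumption: enough room for a one-word label
    )
    return result["choices"][0]["message"]["content"].strip()
```

Calling `classify_with_gguf("Your account has been suspended. Verify now: ...")` should return the model's raw answer; loading the model once and reusing it would be preferable when classifying many emails.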
Transformers + PEFT (adapter)
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "google/gemma-4-E2B-it"
tok = AutoTokenizer.from_pretrained(BASE)
# Load the base model in bf16, then attach the LoRA adapter from this repo.
model = AutoModelForCausalLM.from_pretrained(BASE, dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "vete-speis/gemma-4-E2B-it-phishing")

msgs = [
    {"role": "system", "content": "You are an email security classifier. Classify the email as 'phishing' or 'legit'. Respond with one word only."},
    {"role": "user", "content": "Your account has been suspended. Verify now: http://secure-login.example.xyz"},
]
ids = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt").to(model.device)
# Greedy decoding; a handful of new tokens is enough for a one-word label.
out = model.generate(ids, max_new_tokens=5, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True))  # -> phishing
Training
- Base: google/gemma-4-E2B-it (2.3B effective params)
- Data: zefang-liu/phishing-email-dataset, cleaned, class-balanced 50/50, 10.5k train / 500 val / 1,000 test, emails truncated to 1,500 chars
- Method: SFT with TRL SFTTrainer + PEFT LoRA: r=16, alpha=32, dropout 0.05; targets are the inner nn.Linear of the q/k/v/o projections (Gemma 4 wraps them in Gemma4ClippableLinear, which PEFT cannot wrap directly)
- Run: 1 epoch, effective batch 16 (4×4), seq len 512, lr 2e-4 cosine, bf16, gradient checkpointing; ~15 min on a single H100 (Modal)
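The data preparation described above (truncation to 1,500 characters, 50/50 class balance, chat-formatted examples) might look roughly like the following. This is a sketch under assumptions: the actual cleaning and balancing code is not part of this card, downsampling the larger class is only one way to balance, and the helper names are hypothetical. The `messages` layout follows the conversational format that TRL's SFTTrainer accepts.

```python
import random

MAX_CHARS = 1500  # truncation length stated in the training notes
SYSTEM_PROMPT = (
    "You are an email security classifier. Classify the email as "
    "'phishing' or 'legit'. Respond with one word only."
)


def to_chat_example(email_text: str, label: str) -> dict:
    """Turn one (email, label) pair into a chat-style SFT example."""
    if label not in ("phishing", "legit"):
        raise ValueError(f"unexpected label: {label!r}")
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": email_text[:MAX_CHARS]},
            {"role": "assistant", "content": label},
        ]
    }


def balance(rows: list[tuple[str, str]], seed: int = 0) -> list[tuple[str, str]]:
    """Downsample the larger class so both labels appear equally often (assumed strategy)."""
    rng = random.Random(seed)
    phishing = [r for r in rows if r[1] == "phishing"]
    legit = [r for r in rows if r[1] == "legit"]
    n = min(len(phishing), len(legit))
    balanced = rng.sample(phishing, n) + rng.sample(legit, n)
    rng.shuffle(balanced)
    return balanced
```

Truncating by characters before tokenization keeps most examples within the 512-token sequence length, although long emails with dense tokenization may still be cut by the trainer.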
Limitations
- Trained on English emails. Informally it appears to generalize to other languages (the base model is multilingual), but it has not been formally evaluated outside English.
- The dataset's "phishing" class includes generic spam, so the model leans toward flagging unsolicited marketing/cold-outreach emails as phishing. If you need a strict spam ≠ phishing distinction, you need 3-class data.
- ~2.6% of outputs may be off-format; parse with a contains-check on phishing/legit and treat anything else as abstention.
- Not a substitute for a mail security gateway. Use as a triage aid.
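The contains-check with abstention suggested above could be implemented along these lines. This is a minimal sketch: `parse_label` is a hypothetical helper, and treating an output that mentions both labels as an abstention is an assumption rather than something the card specifies.

```python
from typing import Optional


def parse_label(raw_output: str) -> Optional[str]:
    """Map raw model text to 'phishing', 'legit', or None (abstain)."""
    text = raw_output.lower()
    has_phishing = "phishing" in text
    has_legit = "legit" in text  # also matches "legitimate"
    if has_phishing and not has_legit:
        return "phishing"
    if has_legit and not has_phishing:
        return "legit"
    # Neither label, or both labels: treat as off-format and abstain.
    return None
```

Emails for which this returns `None` can be routed to a human reviewer or a fallback rule instead of being silently counted as either class.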
License
Gemma derivatives are governed by the Gemma Terms of Use. The adapter and GGUF files here are derivatives of google/gemma-4-E2B-it.