Gemma‑4 HS Code Classifier (Cambodia Customs)
A Gemma‑4‑E4B‑it model fine‑tuned with QLoRA to classify product descriptions into 8‑digit HS codes and return corresponding Cambodian trade rates (Customs Duty, Special Tax, VAT, Excise Tax).
Built with Unsloth for fast, memory‑efficient fine‑tuning on a single T4 GPU.
🎯 What it does
Given a plain‑English product description, the model generates:
HS Code: 61091000
Unit: PIECE
Customs Duty: 25%
Special Tax: 0%
VAT: 10%
Excise Tax: 0%
⚠️ Important: The rates in the text are generated by the model and may be wrong.
For production, always use the included lookup table (hs_code_lookup.json) – see Production use below.
🚀 Quick start (in Colab or locally)
This repository contains only the LoRA adapter, not the full model.
Loading it will automatically download the base model (unsloth/gemma-4-E4B-it) and apply the adapter in 4-bit.
# %% [Install]
%%capture
import os, re
# Install everything needed for the T4 Colab environment
!pip install sentencepiece protobuf "datasets==4.3.0" "huggingface_hub>=0.34.0" hf_transfer
!pip install --no-deps unsloth_zoo bitsandbytes accelerate xformers peft trl triton unsloth
!pip install --no-deps --upgrade "torchao>=0.16.0"
!pip install --no-deps transformers==5.5.0 "tokenizers>=0.22.0,<=0.23.0"
!pip install torchcodec
import torch
torch._dynamo.config.recompile_limit = 64
import warnings
# Suppress the specific PyTorch size check warning from bitsandbytes
warnings.filterwarnings(
    "ignore",
    category=FutureWarning,
    message=".*_check_is_size will be removed in a future PyTorch release.*",
)
#------------
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    "Sothay/gemma4-hscode-classifier",  # LoRA adapter on Hugging Face
    load_in_4bit = True,                # required – the adapter was trained in 4-bit
    max_seq_length = 1024,
)
# ---------- Inference with the authoritative lookup table (recommended) ----------
import json, re
from huggingface_hub import hf_hub_download

# Fetch the lookup table shipped with this repository
lookup_path = hf_hub_download("Sothay/gemma4-hscode-classifier", "hs_code_lookup.json")
with open(lookup_path) as f:
    rate_lookup = json.load(f)

def predict_hs_code(description: str) -> dict:
    system_prompt = (
        "You are a customs compliance AI. Classify the product description to its "
        "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
        "Special Tax, VAT, Excise Tax) and unit."
    )
    messages = [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {"role": "user", "content": [{"type": "text", "text": f"Description: {description}"}]},
    ]
    # Tokenize explicitly and return a dict so it can be moved to the GPU and unpacked
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to("cuda")
    out = model.generate(**inputs, max_new_tokens=80, do_sample=False)
    # Decode only the newly generated tokens, not the prompt
    text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    m = re.search(r"HS Code:\s*([0-9]{4,10})", text)
    code = m.group(1) if m else None
    # Rates come from the lookup table, never from the generated text
    if code and code in rate_lookup:
        return {"hs_code": code, "source": "lookup_table", **rate_lookup[code]}
    return {"hs_code": code, "source": "model_only_UNVERIFIED", "raw_output": text}

print(predict_hs_code("Men's cotton knitted T-shirt"))
🔍 Raw model output (debugging)
If you want to see exactly what the model generated (including the rates it predicted) without the lookup table, use the raw‑output function below.
Do not use these rates in production – they are only for debugging or confidence evaluation.
def predict_hs_code_raw(description: str, max_new_tokens=100) -> dict:
    system_prompt = (
        "You are a customs compliance AI. Classify the product description to its "
        "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
        "Special Tax, VAT, Excise Tax) and unit."
    )
    messages = [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {"role": "user", "content": [{"type": "text", "text": f"Description: {description}"}]},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to("cuda")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True, do_sample=False)
    raw_text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    def extract(pattern, text):
        m = re.search(pattern, text)
        return m.group(1).strip() if m else None

    return {
        "hs_code": extract(r"HS Code:\s*([0-9.]+)", raw_text),
        "unit": extract(r"Unit:\s*(.*)", raw_text),
        "cd_rate": extract(r"Customs Duty:\s*([\d.]+)%?", raw_text),
        "st_rate": extract(r"Special Tax:\s*([\d.]+)%?", raw_text),
        "vat_rate": extract(r"VAT:\s*([\d.]+)%?", raw_text),
        "et_rate": extract(r"Excise Tax:\s*([\d.]+)%?", raw_text),
        "raw_output": raw_text,
    }

# Example
raw = predict_hs_code_raw("Men's cotton knitted T-shirt")
print(raw["raw_output"])
print(raw["hs_code"])  # model’s guess
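For confidence evaluation, one option is to compare the rates the model generated with the lookup-table entry for the same code. The sketch below is a hypothetical helper, not part of this repository. It assumes the lookup entry uses the same keys as the dict returned by predict_hs_code_raw (cd_rate, st_rate, vat_rate, et_rate); the actual schema of hs_code_lookup.json may differ, so adjust the keys accordingly.

```python
# Hypothetical helper: key names are assumed to match predict_hs_code_raw's output.
RATE_KEYS = ("cd_rate", "st_rate", "vat_rate", "et_rate")


def _to_float(value):
    """Convert '25', '25%', 25 or None to a float; return None if not parseable."""
    if value is None:
        return None
    try:
        return float(str(value).strip().rstrip("%"))
    except ValueError:
        return None


def rate_mismatches(raw: dict, lookup_entry: dict) -> dict:
    """Return {key: (model_rate, table_rate)} for every rate that disagrees.

    A rate missing on both sides is treated as agreement.
    """
    mismatches = {}
    for key in RATE_KEYS:
        predicted = _to_float(raw.get(key))
        official = _to_float(lookup_entry.get(key))
        if predicted != official:
            mismatches[key] = (predicted, official)
    return mismatches


# Made-up values for illustration only
example_raw = {"cd_rate": "25", "st_rate": "0", "vat_rate": "10", "et_rate": "5"}
example_entry = {"cd_rate": "25%", "st_rate": 0, "vat_rate": 10.0, "et_rate": "0"}
print(rate_mismatches(example_raw, example_entry))  # {'et_rate': (5.0, 0.0)}
```

An empty result means the model's recalled rates agree with the table for that code. A non-empty result is a signal to inspect the prediction, not a reason to prefer the model's numbers.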
🧠 Training details
- Base model: unsloth/gemma-4-E4B-it (4‑bit QLoRA)
- Adapter rank: r=16, alpha=16, targeting all language & attention layers
- Gradient checkpointing: Unsloth’s own implementation (avoids Gemma‑4 KV‑shared layer bug)
- Dataset: Custom Cambodian HS‑code dataset (hs_code.csv) with descriptions, codes, and official rates
  - Cleaned, deduplicated, split into 90/10 train/validation
  - Chat roles fixed to system/user/assistant (Gemma‑4 standard)
- Training config: 3 epochs, effective batch size 8, learning rate 2e‑4, linear schedule, eval & save every epoch, best model loaded
- Hardware: Google Colab T4 (16 GB) – peak memory ~10 GB thanks to QLoRA
- Accuracy: Evaluated on held‑out examples (exact HS‑code match) – see model card for current numbers
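The deduplication, 90/10 split, and exact-match metric listed above can be sketched with the standard library. This is an illustration under assumptions, not the author's actual pipeline: the field names description and hs_code, the case-insensitive duplicate rule, and the fixed seed are all hypothetical choices.

```python
import random


def dedupe_and_split(rows, val_fraction=0.1, seed=0):
    """Drop duplicate (description, hs_code) pairs, then split into train/validation."""
    seen = set()
    unique = []
    for row in rows:
        # Assumed duplicate rule: same code and same description ignoring case/whitespace
        key = (row["description"].strip().lower(), row["hs_code"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique)
    n_val = max(1, round(len(unique) * val_fraction))
    return unique[n_val:], unique[:n_val]


def exact_match_accuracy(predicted, expected):
    """Fraction of predictions whose HS code equals the reference code exactly."""
    if not expected:
        raise ValueError("expected must not be empty")
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


# Synthetic rows for illustration: 20 distinct items plus one duplicate
rows = [{"description": f"item {i}", "hs_code": f"{i:08d}"} for i in range(20)]
rows.append({"description": " ITEM 0 ", "hs_code": "00000000"})
train_rows, val_rows = dedupe_and_split(rows)
print(len(train_rows), len(val_rows))  # 18 2
```

Under exact match, a prediction that is right at the chapter or heading level but wrong in the last digits counts as a miss, which is the appropriate standard when the code is used to look up rates.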
⚖️ Production use
Always use the lookup table – never trust the model’s generated rates.
The model is a classifier: description → HS code.
Rates are fetched deterministically from hs_code_lookup.json, a file extracted from the same official tariff data used during training.
Why?
- A causal LM recalling a rate from memory will occasionally hallucinate – a customs tool with confident, wrong numbers is worse than one that says “I don’t know”.
- The lookup table returns each rate exactly as recorded in the tariff data, so the rates are correct whenever the HS code is correct and the table is current.
The hs_code_lookup.json file is included in this repository and can be downloaded via:
from huggingface_hub import hf_hub_download
hf_hub_download("Sothay/gemma4-hscode-classifier", "hs_code_lookup.json")
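The "say I don't know" behaviour argued for above can be sketched as a fail-closed lookup: the code is normalised, validated as 8 digits, and rates are returned only when the table has an entry. The helper names normalize_hs_code and lookup_rates are hypothetical, and the sketch assumes the table is keyed by 8-digit strings, as the quick-start code implies. The demo entry uses the sample values from the output format shown earlier and is not read from the real file.

```python
import re


def normalize_hs_code(code):
    """Remove dots and whitespace; return the code if it is exactly 8 digits, else None."""
    if code is None:
        return None
    compact = re.sub(r"[.\s]", "", str(code))
    return compact if re.fullmatch(r"[0-9]{8}", compact) else None


def lookup_rates(code, rate_lookup):
    """Return rates only for a valid, known code; never fall back to model-generated rates."""
    normalized = normalize_hs_code(code)
    if normalized is None:
        return {"status": "invalid_code", "hs_code": code}
    entry = rate_lookup.get(normalized)
    if entry is None:
        return {"status": "unknown_code", "hs_code": normalized}
    return {"status": "ok", "hs_code": normalized, "rates": entry}


# Demo table for illustration only; the real entries come from hs_code_lookup.json
demo_lookup = {"61091000": {"unit": "PIECE", "cd_rate": "25%", "vat_rate": "10%"}}
print(lookup_rates("6109.10.00", demo_lookup))
print(lookup_rates("99999999", demo_lookup))
```

Returning an explicit status instead of a best guess lets the calling application route unknown or malformed codes to manual review.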
📦 Files in this repository
| File | Description |
|---|---|
| adapter_model.safetensors | LoRA adapter weights (few MB) |
| adapter_config.json | Adapter configuration (references base model) |
| tokenizer.json, tokenizer_config.json | Tokenizer files |
| hs_code_lookup.json | Authoritative rate table for production inference |
| README.md | This file |
Note: Only the adapter is stored here. The Gemma‑4 base model (unsloth/gemma-4-E4B-it) is fetched automatically from the Hugging Face Hub when you call FastModel.from_pretrained.
If you need a merged 16‑bit model (for vLLM, TGI, etc.), generate it locally with Unsloth:
model.save_pretrained_merged("merged_fp16", tokenizer, save_method="merged_16bit")
🦙 Ollama / llama.cpp (GGUF)
Export a quantized GGUF directly from the loaded adapter:
model.save_pretrained_gguf("gguf_model", tokenizer, quantization_method="q4_k_m")
Then load the GGUF in Ollama through a Modelfile that sets temperature to 0, so that sampling is deterministic.
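A minimal sketch of such a Modelfile, generated from Python so the system prompt stays identical to the one used in the quick start. The helper build_modelfile and the path ./gguf_model/model.gguf are hypothetical; use the file name that the GGUF export actually wrote. FROM, PARAMETER temperature and SYSTEM are standard Modelfile instructions. No TEMPLATE is set here, on the assumption that the chat template embedded in the GGUF is picked up; if it is not, one has to be added.

```python
SYSTEM_PROMPT = (
    "You are a customs compliance AI. Classify the product description to its "
    "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
    "Special Tax, VAT, Excise Tax) and unit."
)


def build_modelfile(gguf_path: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Return Modelfile text that points at a local GGUF and disables sampling randomness."""
    lines = [
        f"FROM {gguf_path}",
        "PARAMETER temperature 0",  # deterministic output
        f'SYSTEM """{system_prompt}"""',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical path: replace with the GGUF file produced by the export step
modelfile_text = build_modelfile("./gguf_model/model.gguf")
print(modelfile_text)
```

Write the text to a file named Modelfile and register it with `ollama create <name> -f Modelfile`. Rates should still be taken from hs_code_lookup.json rather than from the generated text.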
📊 Example predictions
| Description | Predicted HS Code | Unit | CD | ST | VAT | ET |
|---|---|---|---|---|---|---|
| Toyota Hilux pickup, diesel 2.8L | 87042110 | UNIT | 35% | 50% | 10% | 0% |
| iPhone 15 Pro Max 256GB | 85171200 | UNIT | 0% | 0% | 10% | 0% |
| Heineken beer 330ml can | 22030010 | LTR | 35% | 30% | 10% | 0% |
(Rates from lookup table – not generated by the model.)
⚠️ Limitations
- The model may output incorrect HS codes for ambiguous, misspelled, or region‑specific descriptions.
- It was trained on a fixed set of Cambodian HS codes; revisions after the training data cutoff are not covered.
- Duty rates can become outdated – always cross‑check with the latest official tariff schedule.
- The model is a classifier, not a legal authority. For binding decisions, consult a customs professional.
📝 License
This model is a derivative of Gemma‑4‑E4B‑it and is subject to the Gemma license.
The HS‑code dataset and lookup table are the property of their respective owners.
🙏 Acknowledgments
- Unsloth – made QLoRA + Gemma‑4 on a T4 effortless
- Google DeepMind – for the Gemma family of models
📚 Citation
If you use this model, please cite:
@misc{gemma4-hscode-classifier,
  author = {Sothay},
  title = {Gemma‑4 HS Code Classifier (Cambodia Customs)},
  year = 2025,
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Sothay/gemma4-hscode-classifier}}
}
Author: Sothay
Model card version: 1.2