Instructions for using aitf-kpm-ugm/Qwen3-4B-CPT-Base with libraries, inference servers, and local apps.

Libraries

How to use aitf-kpm-ugm/Qwen3-4B-CPT-Base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="aitf-kpm-ugm/Qwen3-4B-CPT-Base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("aitf-kpm-ugm/Qwen3-4B-CPT-Base")
model = AutoModelForCausalLM.from_pretrained("aitf-kpm-ugm/Qwen3-4B-CPT-Base")

Local Apps

vLLM

How to use aitf-kpm-ugm/Qwen3-4B-CPT-Base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "aitf-kpm-ugm/Qwen3-4B-CPT-Base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "aitf-kpm-ugm/Qwen3-4B-CPT-Base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/aitf-kpm-ugm/Qwen3-4B-CPT-Base

SGLang

How to use aitf-kpm-ugm/Qwen3-4B-CPT-Base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "aitf-kpm-ugm/Qwen3-4B-CPT-Base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "aitf-kpm-ugm/Qwen3-4B-CPT-Base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "aitf-kpm-ugm/Qwen3-4B-CPT-Base" \
        --host 0.0.0.0 \
        --port 30000

Unsloth Studio

How to use aitf-kpm-ugm/Qwen3-4B-CPT-Base with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for aitf-kpm-ugm/Qwen3-4B-CPT-Base to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for aitf-kpm-ugm/Qwen3-4B-CPT-Base to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for aitf-kpm-ugm/Qwen3-4B-CPT-Base to start chatting

Load model with FastModel

# Install first (shell command, not Python): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="aitf-kpm-ugm/Qwen3-4B-CPT-Base",
    max_seq_length=2048,
)

Docker Model Runner
How to use aitf-kpm-ugm/Qwen3-4B-CPT-Base with Docker Model Runner:
```
docker model run hf.co/aitf-kpm-ugm/Qwen3-4B-CPT-Base
```

Model Card for Qwen3-4B-CPT-Base

Continued pre-trained (CPT) variant of Qwen3-4B-Base, adapted to Indonesian on ~200M domain tokens. Base model — not instruction-tuned.

Model Details

Model Description

Qwen3-4B-CPT-Base extends Qwen/Qwen3-4B-Base with continued pre-training on a ~200M-token Indonesian corpus (news, Wikipedia, social media). The goal is Indonesian-domain adaptation as the foundation for downstream SFT. It is a base model: it performs raw text completion and is not tuned for instruction-following or chat. Part of the Model Narasi Isu pipeline (CPT -> SFT -> Deployment) for Indonesian public-issue monitoring and narrative analysis.

Developed by: AITF UGM 2026
Model type: Causal decoder-only LLM (continued pre-training)
Language(s) (NLP): Indonesian (Bahasa Indonesia); English technical terms preserved
License: Qwen License
Finetuned from model: Qwen/Qwen3-4B-Base

Model Sources

Repository: https://huggingface.co/aitf-ugm-2026

Uses

Direct Use

Indonesian-domain text completion. Perplexity benchmarking against vanilla Qwen3 baselines.

Downstream Use

Foundation for supervised fine-tuning (SFT) on Indonesian tasks: summarization, issue narrative analysis (ABSA), dashboard previews, chatbot Q&A.

Out-of-Scope Use

Not for chat or instruction-following before SFT. Not for high-stakes decisions without human review. Not a safety-aligned assistant.

Bias, Risks, and Limitations

Not instruction-tuned: no reliable JSON, chat, or task behavior. Corpus is news-heavy (70%), so outputs may reflect media and social-media biases. Coverage skews to topics present in the corpus window.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. Validate outputs; apply SFT before task deployment.

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "aitf-kpm-ugm/Qwen3-4B-CPT-Base"  # repo id as hosted on the Hub
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Ibu kota Indonesia adalah"
ids = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**ids, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))

vLLM (use completions endpoint, not chat):

vllm serve aitf-kpm-ugm/Qwen3-4B-CPT-Base \
  --gpu-memory-utilization 0.90 --max-model-len 8192
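
Because this is a base model, requests should go to the plain completions route rather than the chat route. The following is a minimal client sketch using only the Python standard library. It assumes the server listens on vLLM's default address (http://localhost:8000) and that `model` matches the name passed to `vllm serve`; the helper names `build_completion_request` and `complete` are illustrative, not part of any library.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=64, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,    # must match the name passed to `vllm serve`
        "prompt": prompt,  # raw text to continue; no chat template is applied
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the generated continuation text."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # The completions API returns the continuation under choices[0].text.
    return body["choices"][0]["text"]
```

With a server running, `complete("http://localhost:8000", served_model_name, "Ibu kota Indonesia adalah")` would return the continuation as a string, where `served_model_name` is whatever id the server was started with.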

Training Details

Training Data

~200M tokens, group-aware split (train/val/test = 0.99 / 0.005 / 0.005).

Source	Share	Tokens
News	70%	~140M
Wikipedia (Indonesian)	20%	~40M
Social media	10%	~20M
Total	100%	~200M

Train split: 325,860 records / ~198M tokens. Test: 1,655 records (news 1,098 / socmed 191 / wiki 366).
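
The card does not say what defines a "group" or how records were assigned to splits. One common way to build a group-aware split is to hash a group key, so every record sharing that key lands in the same split. The sketch below assumes each record carries a hypothetical `group_id` field (for example a source article or thread id) and uses the 0.99 / 0.005 / 0.005 ratios as defaults. Hash-based assignment only approximates those ratios.

```python
import hashlib


def assign_split(group_id, val_frac=0.005, test_frac=0.005):
    """Map a group key to 'train', 'val', or 'test' deterministically."""
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


def split_records(records, group_key="group_id", val_frac=0.005, test_frac=0.005):
    """Partition records so that all records sharing a group land in one split."""
    splits = {"train": [], "val": [], "test": []}
    for record in records:
        name = assign_split(record[group_key], val_frac, test_frac)
        splits[name].append(record)
    return splits
```

Since the assignment depends only on the group key, rerunning the split on new data never moves an existing group from train to test.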

Training Procedure

Preprocessing

Group-aware train/val/test split to avoid leakage between splits. Sequence packing enabled. Data was processed on local Colab storage (/content/) before being copied to Google Drive.
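
To illustrate what sequence packing means, here is a simple sketch: tokenized documents are concatenated, each followed by an end-of-sequence token, and the stream is cut into fixed-length blocks. This is a generic illustration, not the training library's packing routine. The function name `pack_sequences` is hypothetical, and the actual implementation may handle document boundaries and leftovers differently.

```python
def pack_sequences(tokenized_docs, max_seq_len, eos_id):
    """Concatenate documents (each followed by EOS) and cut into fixed-size blocks."""
    buffer = []
    blocks = []
    for doc in tokenized_docs:
        buffer.extend(doc)
        buffer.append(eos_id)  # mark the document boundary
        while len(buffer) >= max_seq_len:
            blocks.append(buffer[:max_seq_len])
            buffer = buffer[max_seq_len:]
    # The leftover tail (shorter than max_seq_len) is dropped in this sketch.
    return blocks
```

Packing avoids spending compute on padding tokens, which matters when most documents are far shorter than the 8192-token maximum.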

Training Hyperparameters

Training regime: bf16 mixed precision
Method: LoRA, RSLoRA enabled
LoRA rank / alpha: 128 / 256
Extra modules: embed_tokens, lm_head included
LoRA dropout: 0.0
Max seq length: 8192
Packing: True; 4-bit load: False
Epochs: 1
Per-device batch: 12; grad accumulation: 16; effective batch: 192
Learning rate: 1e-5; embedding LR: 5e-6
Scheduler: cosine; warmup ratio: 0.03
Optimizer: adamw_8bit; weight decay: 0.01
Seed: 3407; early stopping enabled
Save format: merged_16bit
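
A few quantities follow from the hyperparameters above and can be checked with simple arithmetic. The sketch below assumes a single GPU (consistent with effective batch 192 = 12 x 16), fully packed 8192-token sequences, and the ~198M-token train split. The step counts are therefore rough estimates, and early stopping may end training sooner.

```python
import math

rank, alpha = 128, 256
per_device_batch, grad_accum = 12, 16
max_seq_len = 8192
train_tokens = 198_000_000  # approximate train-split size from the card
warmup_ratio = 0.03

# Scaling factor applied to the low-rank update.
standard_scale = alpha / rank           # classic LoRA: alpha / r
rslora_scale = alpha / math.sqrt(rank)  # rank-stabilized LoRA: alpha / sqrt(r)

# Sequences per optimizer step, assuming one GPU.
effective_batch = per_device_batch * grad_accum
# Upper bound on tokens per step, assuming every sequence is fully packed.
tokens_per_step = effective_batch * max_seq_len
steps_per_epoch = math.ceil(train_tokens / tokens_per_step)
warmup_steps = math.ceil(warmup_ratio * steps_per_epoch)

print(f"LoRA scale: standard={standard_scale}, rsLoRA={rslora_scale:.3f}")
print(f"tokens/step={tokens_per_step:,}, steps/epoch~{steps_per_epoch}, warmup~{warmup_steps}")
```

Under these assumptions one epoch is on the order of a hundred optimizer steps, so the 3% warmup covers only a handful of steps.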

Evaluation

Testing Data, Factors & Metrics

Testing Data

Held-out test set: 1,655 documents (news 1,098 / socmed 191 / wiki 366).

Factors

Disaggregated by source domain: news, social media, Wikipedia.

Metrics

Perplexity (lower is better). Eval: ~1M tokens, max_length=4096, stride=1024, bf16 / 4-bit.
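
The card gives only `max_length` and `stride`, so the exact evaluation script is not known. The sketch below follows the common strided (sliding-window) pattern for fixed-context models: each window spans up to `max_length` tokens, windows advance by `stride`, and only tokens not scored by an earlier window count toward the loss. The function names are illustrative. In a real run the summed negative log-likelihood would come from the model.

```python
import math


def sliding_windows(seq_len, max_length=4096, stride=1024):
    """Yield (begin, end, n_scored) for strided perplexity evaluation.

    Each window covers tokens [begin, end). Only the last n_scored tokens are
    scored; the earlier tokens in the window serve as context.
    """
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        yield begin, end, end - prev_end
        prev_end = end
        if end == seq_len:
            break


def perplexity(total_nll, n_scored_tokens):
    """Perplexity is exp of the mean negative log-likelihood per scored token."""
    return math.exp(total_nll / n_scored_tokens)
```

A smaller stride gives each scored token more context and usually a lower reported perplexity, at the cost of more forward passes, so numbers are only comparable across models when `max_length` and `stride` are held fixed.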

Results

Model	Full	News	Socmed	Wiki
Qwen3-4B-CPT-Base (this)	4.561	4.108	4.418	6.492
Qwen3-4B-Base (vanilla)	5.930	5.389	6.438	7.757
Improvement	~23%	~24%	~31%	~16%
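
The improvement row can be reproduced from the two perplexity rows as the relative reduction (baseline - CPT) / baseline. A short check using the values from the table:

```python
# (vanilla Qwen3-4B-Base, Qwen3-4B-CPT-Base) perplexities from the table above
results = {
    "Full": (5.930, 4.561),
    "News": (5.389, 4.108),
    "Socmed": (6.438, 4.418),
    "Wiki": (7.757, 6.492),
}


def relative_improvement(baseline, adapted):
    """Relative perplexity reduction in percent."""
    return (baseline - adapted) / baseline * 100


improvements = {
    subset: round(relative_improvement(base, cpt), 1)
    for subset, (base, cpt) in results.items()
}
print(improvements)
```

The rounded values agree with the table's ~23%, ~24%, ~31%, and ~16%.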

Summary

CPT cuts perplexity ~23% overall vs vanilla Qwen3-4B-Base, and beats vanilla Qwen3-8B-Base on all four subsets. Domain adaptation outweighs raw parameter count for this Indonesian domain. Largest gain on social media (~31%).

Technical Specifications

Model Architecture and Objective

Qwen3 causal decoder-only transformer. Objective: continued causal language-model pre-training (next-token prediction).

Compute Infrastructure

Hardware

NVIDIA A100 80GB (Google Colab Pro+).

Software

Unsloth, TRL, HuggingFace Transformers, PEFT, bitsandbytes. Monitoring: WandB.

Citation

BibTeX:

@misc{qwen3_4b_cpt_base,
  title  = {Qwen3-4B-CPT-Base: Indonesian Continued Pre-Training},
  author = {AITF UGM 2026},
  year   = {2026},
  note   = {Model Narasi Isu pipeline}
}

APA:

AITF UGM 2026. (2026). Qwen3-4B-CPT-Base: Indonesian Continued Pre-Training. Model Narasi Isu pipeline.

More Information

Model Narasi Isu: Indonesian public-issue monitoring and narrative analysis pipeline.

Model Card Authors

AITF UGM 2026

Model Card Contact

https://huggingface.co/aitf-ugm-2026
