Webs-Sejong-31B
🏆 2nd place, NIA K-AI Leaderboard (leaderboard.aihub.or.kr). Average score 0.611 over NIA's private Korean benchmark suite (KMMLU-Pro · CLIcK · HLE · MuSR · Com2), ranked among 69 evaluated models on 2026-06-24. (#1 = 0.621.)
Webs-Sejong-31B is a 31B-parameter Korean-centric language model built on the
Gemma-4 architecture through weight-space model merging. It is a merge
model, not a separately trained one, and we state that openly. Starting from
the instruction-tuned Gemma-4-31B foundation, we integrate the top-ranked open
Korean model JGOS-31B-Citizen via a sign-consensus task-vector merge
(DARE-TIES), producing a single checkpoint with strong Korean cultural and
academic competence that remains fully compatible with standard transformers
and vLLM serving.
The result is produced by weight-space merging only, with no additional training.
Highlights
- Korean-first. Optimized for Korean cultural knowledge and professional / academic reasoning, with English ability retained from the base.
- Independently verified. Ranked 2nd on the NIA K-AI Leaderboard, scored on a private benchmark suite the model never sees.
- Drop-in. Standard Gemma-4 architecture and tokenizer; loads in transformers and serves in vLLM with no custom code.
Evaluation — NIA K-AI Leaderboard (official)
Official scores from the NIA K-AI Leaderboard, the public model evaluation run by the Korean government's National Information Society Agency (NIA), scored on 2026-06-24 on a held-out, non-public benchmark suite:
| Benchmark | Webs-Sejong-31B | #1 (JGOS-31B-Citizen) |
|---|---|---|
| KMMLU-Pro (professional / academic) | 0.700 | 0.725 |
| CLIcK (Korean culture / language) | 0.987 | 0.987 |
| HLE (Ko) | 0.062 | 0.061 |
| MuSR (Ko) | 0.570 | 0.591 |
| Com2 (commonsense) | 0.736 | 0.742 |
| Average | 0.611 (2nd) | 0.621 (1st) |
Because Webs-Sejong-31B is a merge of Gemma-4-31B with JGOS-31B-Citizen, its scores closely track — and sit just below — that source model. We report this honestly rather than overstating independent capability.
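The reported averages can be reproduced from the per-benchmark rows. The sketch below assumes the leaderboard average is the unweighted mean of the five benchmark scores, rounded to three decimals; that assumption is ours, and it happens to match both reported values.

```python
from statistics import mean

# Per-benchmark scores copied from the table above, in row order:
# KMMLU-Pro, CLIcK, HLE (Ko), MuSR (Ko), Com2.
scores = {
    "Webs-Sejong-31B": [0.700, 0.987, 0.062, 0.570, 0.736],
    "JGOS-31B-Citizen": [0.725, 0.987, 0.061, 0.591, 0.742],
}

# Assumption: the average is an unweighted mean rounded to 3 decimals.
averages = {name: round(mean(vals), 3) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.3f}")
```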
Merge Method
The checkpoint is produced with a memory-bounded streaming implementation of
DARE-TIES (Drop-And-REscale + TrIm-Elect-Sign), merging
google/gemma-4-31B-it (base) with JGOS-Model/JGOS-31B-Citizen (donor):
- The per-source task vector τ = JGOS − base is computed in fp32.
- DARE drops 50% of the task vector's deltas (density = 0.5) and rescales the survivors by 1/density, reducing destructive interference.
- TIES sign election resolves parameter-wise sign conflicts before the merged delta is added back to the base at unit weight.
All 1,188 tensors (including the multimodal vision tower) are merged
tensor-by-tensor and stored in bfloat16. No fine-tuning is applied.
| Parameter | Value |
|---|---|
| Base | google/gemma-4-31B-it (dense, multimodal) |
| Donor | JGOS-Model/JGOS-31B-Citizen |
| Method | DARE-TIES (sign-consensus) |
| Density | 0.5 |
| Weight | 1.0 |
| Seed | 42 |
| Precision | bfloat16 |
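The merge steps described in this section can be sketched for a single flattened tensor as follows. This is a minimal pure-Python illustration under stated assumptions, not the memory-bounded streaming implementation used to build the checkpoint: the function names `dare` and `merge_tensor` are hypothetical, tensors are plain lists of floats, and sign election is assumed to follow the sign of the summed sparsified deltas, with agreeing deltas summed rather than averaged. The defaults mirror the density, weight, and seed listed in the table.

```python
import random


def dare(delta, density, rng):
    """Keep each delta with probability `density`, zero the rest,
    and rescale the survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def merge_tensor(base, donors, density=0.5, weight=1.0, seed=42):
    """Merge one tensor (a flat list of floats) from `donors` into `base`.

    Assumption: a simplified DARE-TIES variant for illustration only.
    """
    rng = random.Random(seed)

    # Task vector per donor: donor minus base.
    task_vectors = [[d - b for d, b in zip(donor, base)] for donor in donors]

    # DARE: random drop and rescale, applied to each task vector.
    sparse = [dare(tv, density, rng) for tv in task_vectors]

    merged = []
    for i, b in enumerate(base):
        column = [tv[i] for tv in sparse]
        total = sum(column)
        # Sign election: take the sign of the summed deltas (0 if they cancel).
        sign = 1 if total > 0 else -1 if total < 0 else 0
        # Keep only deltas that agree with the elected sign.
        agreed = sum(v for v in column if v * sign > 0)
        merged.append(b + weight * agreed)
    return merged


base = [1.0, -2.0, 0.5, 4.0]
donor = [1.5, -2.5, 0.5, 3.0]
print(merge_tensor(base, [donor]))
```

With a single donor, as in this merge, the sign election in this sketch leaves every surviving delta in place, so each merged parameter is either the base value or the base plus twice the original delta (1/density = 2).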
Usage
from transformers import AutoModelForImageTextToText, AutoProcessor
model_id = "websfactory/Webs-Sejong-31B"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
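The snippet above only loads the model. A minimal text-only generation sketch follows, assuming the processor supports the chat-template call pattern common to recent transformers multimodal processors. The helper names `build_messages` and `generate_reply` are hypothetical, and the example prompt is illustrative only.

```python
def build_messages(prompt):
    """Wrap a plain-text prompt in the chat message structure."""
    return [
        {
            "role": "user",
            "content": [{"type": "text", "text": prompt}],
        }
    ]


def generate_reply(model, processor, prompt, max_new_tokens=128):
    """Generate a reply using an already-loaded model and processor."""
    inputs = processor.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    prompt_len = inputs["input_ids"].shape[-1]
    return processor.decode(outputs[0][prompt_len:], skip_special_tokens=True)


# Usage, once `model` and `processor` are loaded as shown above:
# print(generate_reply(model, processor, "세종대왕의 업적을 한 문장으로 설명해 주세요."))
```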
Intended Use & Limitations
Intended for Korean-language assistance, knowledge QA, and reasoning. As a merged model it inherits the capabilities and biases of its sources and should be evaluated for your use case before deployment.
License
This model is a derivative of Gemma-4 and incorporates JGOS-31B-Citizen (also a Gemma-4 derivative); it is distributed under the Gemma Terms of Use. By using this model you agree to those terms and Google's Prohibited Use Policy.