How to use websfactory/Webs-Sejong-31B with libraries and local inference servers.

Libraries

How to use websfactory/Webs-Sejong-31B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="websfactory/Webs-Sejong-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("websfactory/Webs-Sejong-31B")
model = AutoModelForMultimodalLM.from_pretrained("websfactory/Webs-Sejong-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use websfactory/Webs-Sejong-31B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "websfactory/Webs-Sejong-31B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "websfactory/Webs-Sejong-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
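The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_payload` and `post_chat` are illustrative and not part of vLLM, and the default URL assumes the server started above is listening on localhost:8000.

```python
import json
import urllib.request

MODEL_ID = "websfactory/Webs-Sejong-31B"


def build_payload(prompt, image_url=None, model=MODEL_ID):
    """Build an OpenAI-style chat-completions body (hypothetical helper)."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        # Images are passed by URL, mirroring the curl example.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def post_chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `post_chat(build_payload("Describe this image in one sentence.", image_url=...))["choices"][0]["message"]["content"]` should return the generated text. Passing `base_url="http://localhost:30000"` would target the SGLang server instead, since both expose the same OpenAI-compatible route.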

Use Docker

docker model run hf.co/websfactory/Webs-Sejong-31B

SGLang

How to use websfactory/Webs-Sejong-31B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "websfactory/Webs-Sejong-31B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "websfactory/Webs-Sejong-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "websfactory/Webs-Sejong-31B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "websfactory/Webs-Sejong-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'


Webs-Sejong-31B

🏆 2nd place on the NIA K-AI Leaderboard (leaderboard.aihub.or.kr), out of 69 evaluated models. Average score 0.611 on NIA's private Korean benchmark suite (KMMLU-Pro · CLIcK · HLE · MuSR · Com2), scored on 2026-06-24. The #1 model scored 0.621.

Webs-Sejong-31B is a 31B-parameter Korean-centric language model built on the Gemma-4 architecture through weight-space model merging. It is a merge model, not a separately trained one, and we state that openly. Starting from the instruction-tuned Gemma-4-31B foundation, the top-ranked open Korean model JGOS-31B-Citizen is integrated via a sign-consensus task-vector merge (DARE-TIES). This produces a single checkpoint with strong Korean cultural and academic competence that remains fully compatible with standard transformers and vLLM serving.

The result is produced by weight-space merging only, with no additional training.

Highlights

Korean-first. Optimized for Korean cultural knowledge and professional / academic reasoning, with English ability retained from the base.
Independently verified. Ranked 2nd on the NIA K-AI Leaderboard, scored on a private benchmark suite the model never sees.
Drop-in. Standard Gemma-4 architecture and tokenizer; loads in transformers and serves in vLLM with no custom code.

Evaluation — NIA K-AI Leaderboard (official)

Official scores from the NIA K-AI Leaderboard, the Korean government's public model evaluation, operated by the National Information Society Agency (NIA). Scored on 2026-06-24 on a held-out, non-public benchmark suite:

Benchmark	Webs-Sejong-31B	#1 (JGOS-31B-Citizen)
KMMLU-Pro (professional / academic)	0.700	0.725
CLIcK (Korean culture / language)	0.987	0.987
HLE (Ko)	0.062	0.061
MuSR (Ko)	0.570	0.591
Com2 (commonsense)	0.736	0.742
Average	0.611 (2nd)	0.621 (1st)

Because Webs-Sejong-31B is a merge of Gemma-4-31B with JGOS-31B-Citizen, its scores closely track that source model: it ties on CLIcK, is 0.001 ahead on HLE, and sits slightly below on KMMLU-Pro, MuSR, Com2, and the average. We report this honestly rather than overstating independent capability.
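As a quick sanity check on the table, the sketch below recomputes the averages and the per-benchmark gaps. It assumes the leaderboard average is the unweighted mean of the five benchmark scores; the leaderboard's exact aggregation is not stated here, but the published numbers are consistent with that assumption.

```python
# Per-benchmark scores copied from the table above.
scores = {
    "Webs-Sejong-31B": {"KMMLU-Pro": 0.700, "CLIcK": 0.987, "HLE": 0.062, "MuSR": 0.570, "Com2": 0.736},
    "JGOS-31B-Citizen": {"KMMLU-Pro": 0.725, "CLIcK": 0.987, "HLE": 0.061, "MuSR": 0.591, "Com2": 0.742},
}


def unweighted_average(per_benchmark):
    """Assumed aggregation: plain mean over the benchmarks."""
    return sum(per_benchmark.values()) / len(per_benchmark)


averages = {name: round(unweighted_average(s), 3) for name, s in scores.items()}
print(averages)  # {'Webs-Sejong-31B': 0.611, 'JGOS-31B-Citizen': 0.621}

# Gap per benchmark (merge minus source); positive means the merge is ahead.
merge, source = scores["Webs-Sejong-31B"], scores["JGOS-31B-Citizen"]
gaps = {bench: round(merge[bench] - source[bench], 3) for bench in merge}
print(gaps)
```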

Merge Method

The checkpoint is produced with a memory-bounded streaming implementation of DARE-TIES (Drop-And-REscale + TrIm-Elect-Sign), merging google/gemma-4-31B-it (base) with JGOS-Model/JGOS-31B-Citizen (donor):

The per-source task vector τ = JGOS − base is computed in fp32.
DARE drops 50% of the task vector's deltas (density = 0.5) and rescales the survivors by 1/density, reducing destructive interference.
TIES sign election resolves parameter-wise sign conflicts before the merged delta is added back to the base at unit weight.

All 1,188 tensors (including the multimodal vision tower) are merged tensor-by-tensor and stored in bfloat16. No fine-tuning is applied.
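The steps above can be illustrated on a single tensor. This is a simplified NumPy sketch, not the streaming implementation used to build the checkpoint: the function name `dare_ties_merge` and its signature are hypothetical, the sign election follows the usual TIES rule (sign of the summed deltas, then a mean over agreeing deltas), and the result is returned in float32 because NumPy has no native bfloat16 type.

```python
import numpy as np


def dare_ties_merge(base, donors, density=0.5, weight=1.0, seed=42):
    """Merge donor tensors into `base` with a simplified DARE-TIES step.

    base:   np.ndarray holding one tensor of the base model
    donors: list of np.ndarray with the same shape as `base`
    """
    rng = np.random.default_rng(seed)
    base32 = base.astype(np.float32)

    sparse_deltas = []
    for donor in donors:
        # Task vector: donor minus base, computed in fp32.
        tau = donor.astype(np.float32) - base32
        # DARE: keep each delta with probability `density`,
        # then rescale the survivors by 1/density.
        keep = rng.random(tau.shape) < density
        sparse_deltas.append(np.where(keep, tau / density, 0.0).astype(np.float32))

    stacked = np.stack(sparse_deltas)  # shape: (num_donors, *tensor_shape)
    # TIES sign election: the elected sign is the sign of the summed deltas.
    elected = np.sign(stacked.sum(axis=0))
    agrees = (np.sign(stacked) == elected) & (stacked != 0)
    # Average only the deltas that agree with the elected sign.
    counts = np.maximum(agrees.sum(axis=0), 1)
    merged_delta = (stacked * agrees).sum(axis=0) / counts

    # Add the merged delta back to the base at the given weight.
    return (base32 + weight * merged_delta).astype(np.float32)


base = np.zeros((4, 4), dtype=np.float32)
donor = np.ones((4, 4), dtype=np.float32)
merged = dare_ties_merge(base, [donor])
print(merged)  # every entry is 0.0 (dropped) or 2.0 (kept, rescaled by 1/0.5)
```

With a single donor, as in this merge, every surviving delta trivially agrees with the elected sign, so the election step only changes the result when several donors are merged at once.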

Parameter	Value
Base	google/gemma-4-31B-it (dense, multimodal)
Donor	JGOS-Model/JGOS-31B-Citizen
Method	DARE-TIES (sign-consensus)
Density	0.5
Weight	1.0
Seed	42
Precision	bfloat16

Usage

from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "websfactory/Webs-Sejong-31B"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

Intended Use & Limitations

Intended for Korean-language assistance, knowledge QA, and reasoning. As a merged model it inherits the capabilities and biases of its sources and should be evaluated for your use case before deployment.

License

This model is a derivative of Gemma-4 and incorporates JGOS-31B-Citizen (also a Gemma-4 derivative); it is distributed under the Gemma Terms of Use. By using this model you agree to those terms and Google's Prohibited Use Policy.
