Libraries

How to use flavianv/qwen4b-apparel23-bundle-sft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="flavianv/qwen4b-apparel23-bundle-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
# This is a text-only causal LM, so AutoModelForCausalLM is used (as in the Quick start below)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("flavianv/qwen4b-apparel23-bundle-sft")
model = AutoModelForCausalLM.from_pretrained("flavianv/qwen4b-apparel23-bundle-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use flavianv/qwen4b-apparel23-bundle-sft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "flavianv/qwen4b-apparel23-bundle-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "flavianv/qwen4b-apparel23-bundle-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/flavianv/qwen4b-apparel23-bundle-sft

SGLang

How to use flavianv/qwen4b-apparel23-bundle-sft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "flavianv/qwen4b-apparel23-bundle-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "flavianv/qwen4b-apparel23-bundle-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "flavianv/qwen4b-apparel23-bundle-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "flavianv/qwen4b-apparel23-bundle-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Qwen3-4B Apparel23 Bundle SFT

Fine-tuned Qwen3-4B-Instruct-2507 for outfit bundle generation: given a natural-language outfit request, return a structured list of apparel items with outfit roles and catalog-style product titles.

Model description

This checkpoint was trained with supervised fine-tuning (full weights, not LoRA) on 20,000 Apparel23 outfit bundles kept by the Qwen-32B labeling pipeline. Each training example maps a user query to a compact bundle of real product titles (titles only; the target contains no ASINs).

Typical use: shopping assistants, outfit planners, or retrieval pipelines that need structured bundle output before product lookup.

Limitations:

Predictions are often plausible for the requested category but rarely match the gold catalog items exactly (1/10 exact bundle match on a 10-sample eval).
Performance drops when explicit item hints are removed from the query.
Trained on English apparel queries from the Apparel23 / Qwen-32B labeling pipeline.

Intended use

System prompt (training default)

You are an outfit bundle assistant for apparel shopping. Given a natural-language outfit request, return the matching bundle as compact product evidence for each selected item. Include the outfit role and product title for every item in the outfit.
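
Since the model was trained with this system prompt, requests to an OpenAI-compatible server (such as the vLLM or SGLang servers shown earlier) would presumably want to include it. The sketch below only builds the request body; `build_chat_payload` is a hypothetical helper, and the choice of `temperature` 0.0 and `max_tokens` 384 is an assumption meant to approximate the greedy settings used in the evaluation.

```python
import json

# System prompt copied from the model card (training default).
SYSTEM_PROMPT = (
    "You are an outfit bundle assistant for apparel shopping. "
    "Given a natural-language outfit request, return the matching bundle as "
    "compact product evidence for each selected item. Include the outfit role "
    "and product title for every item in the outfit."
)

MODEL_ID = "flavianv/qwen4b-apparel23-bundle-sft"


def build_chat_payload(query: str, max_tokens: int = 384) -> dict:
    """Build an OpenAI-style chat-completions body (hypothetical helper)."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
        "max_tokens": max_tokens,
        # Assumption: temperature 0.0 approximates greedy decoding.
        "temperature": 0.0,
    }


payload = build_chat_payload(
    "Casual summer outfit for women: denim shorts and ballet flats"
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed as the `--data` argument of the curl calls in place of the generic example question.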

Output format

### Item 1: dress
Mikarose Chloe Modest Chiffon Maxi Dress or Modest Bridesmaid Dress

### Item 2: footwear
Clarks Women's Danelly Sky Loafer

Supported roles: top, bottom, dress, outer_layer, footwear, accessory.
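
Downstream code usually needs this output as structured data before product lookup. The following is a minimal parsing sketch, assuming the model sticks to the `### Item N: role` header followed by a single title line; `parse_bundle` and `is_valid_bundle` are hypothetical helpers, not part of this repo.

```python
import re

SUPPORTED_ROLES = {"top", "bottom", "dress", "outer_layer", "footwear", "accessory"}

# Matches a header such as "### Item 2: footwear".
_HEADER = re.compile(r"^###\s*Item\s+(\d+):\s*(\S+)\s*$")


def parse_bundle(text: str) -> list[dict]:
    """Parse model output into a list of {"index", "role", "title"} dicts."""
    items = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = _HEADER.match(line)
        if match:
            current = {"index": int(match.group(1)), "role": match.group(2), "title": ""}
            items.append(current)
        elif current is not None and not current["title"]:
            # The first non-empty line after a header is taken as the product title.
            current["title"] = line
    return items


def is_valid_bundle(items: list[dict]) -> bool:
    """Check that the bundle is non-empty and every item has a known role and a title."""
    return bool(items) and all(
        item["role"] in SUPPORTED_ROLES and bool(item["title"]) for item in items
    )


sample = """### Item 1: dress
Mikarose Chloe Modest Chiffon Maxi Dress or Modest Bridesmaid Dress

### Item 2: footwear
Clarks Women's Danelly Sky Loafer
"""
bundle = parse_bundle(sample)
print(bundle)
print(is_valid_bundle(bundle))
```

Lines that do not fit the pattern (for example, stray prose before the first header) are ignored rather than raising, which seems a reasonable default given that predictions are not always exact.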

Quick start

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flavianv/qwen4b-apparel23-bundle-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

system = (
    "You are an outfit bundle assistant for apparel shopping. "
    "Given a natural-language outfit request, return the matching bundle as "
    "compact product evidence for each selected item. Include the outfit role "
    "and product title for every item in the outfit."
)
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "Casual summer outfit for women: denim shorts and ballet flats"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=384, do_sample=False)

print(tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Training data

| Split | Rows |
| --- | --- |
| Train | 20,000 |
| Test (held out) | 13,705 |

Source: 33,705 kept outfits from the Qwen-32B Apparel23 labeling pipeline (train/test split, seed=42).
Dataset: flavianv/apparel23-qwen32b-kept-outfits-with-products
SFT file: apparel23_bundle_sft.train.jsonl
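
The card states the split sizes and the seed but not the splitting code. As an illustration only, a seeded shuffle-and-cut that reproduces the 20,000 / 13,705 sizes could look like the sketch below; the actual procedure (and therefore the actual row assignment) may differ, and `split_indices` is a hypothetical helper.

```python
import random

TOTAL_ROWS = 33_705
TRAIN_ROWS = 20_000
SEED = 42


def split_indices(total: int, n_train: int, seed: int) -> tuple[list[int], list[int]]:
    """Shuffle row indices with a fixed seed and cut into train/test (assumed procedure)."""
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(TOTAL_ROWS, TRAIN_ROWS, SEED)
print(len(train_idx), len(test_idx))  # 20000 13705
```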

Training procedure

| Setting | Value |
| --- | --- |
| Base model | `Qwen/Qwen3-4B-Instruct-2507` |
| Method | Full SFT |
| Epochs | 1 (2,500 steps) |
| Learning rate | 2e-5 |
| Max length | 768 |
| Batch size | 1 × grad accum 8 (effective 8) |
| Loss | Assistant-only |
| Seed | 42 |
| Hardware | NVIDIA B200 MIG `4g.90gb` |
| Run ID | `qwen4b_apparel23_bundle_sft_20260616_142947` |

Training metadata is included in bundle_sft_metadata.json in this repo.
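
The step count in the table follows from the batch settings. A quick arithmetic check, assuming a single device (consistent with the one MIG slice listed) and no dropped examples:

```python
import math

train_rows = 20_000
per_device_batch = 1
grad_accum = 8
epochs = 1
max_length = 768

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(train_rows / effective_batch)
total_steps = steps_per_epoch * epochs

# Upper bound on tokens per optimizer step if every example reached max length.
max_tokens_per_step = effective_batch * max_length

print(effective_batch, total_steps, max_tokens_per_step)  # 8 2500 6144
```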

Evaluation

Greedy decoding (do_sample=False, max_new_tokens=384) unless noted.

Task metrics (perplexity)

| Split | Perplexity | Mean token entropy* |
| --- | --- | --- |
| Train (20k) | 3.12 | 1.19 |
| Test (13.7k) | 3.46 | 1.25 |

*Entropy computed on a 256-row subsample per split (assistant tokens).
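
The card does not say how these numbers were computed. Under the common convention, perplexity is the exponential of the mean per-token negative log-likelihood, and token entropy is the Shannon entropy of each next-token distribution. The sketch below assumes natural logarithms (nats), which is an assumption about units, and uses toy inputs rather than real model outputs.

```python
import math


def perplexity(token_nlls: list[float]) -> float:
    """Perplexity as exp(mean negative log-likelihood), with NLLs in nats."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def entropy(probs: list[float]) -> float:
    """Shannon entropy in nats of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Mean per-token NLL implied by the reported perplexities (assuming nats).
reported = {"train": 3.12, "test": 3.46}
implied_nll = {split: math.log(ppl) for split, ppl in reported.items()}
for split, nll in implied_nll.items():
    print(f"{split}: implied mean NLL = {nll:.3f} nats/token")

# Toy inputs, not model outputs.
toy_probs = [0.5, 0.25, 0.25]
print(entropy(toy_probs))
print(perplexity([math.log(2.0), math.log(8.0)]))
```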

Generalization probes (post-SFT)

| Probe | Score |
| --- | --- |
| Easy math | 90% (9/10) |
| Collapse suite | 87.5% (7/8) |
| Combined | 88.75% |

Zero-shot baseline (same 10 samples, seed=42)

Compared against Qwen/Qwen3-4B-Instruct-2507 with the same system prompt:

| Metric | Zero-shot | This model |
| --- | --- | --- |
| Bundle format compliance | 0/10 | 10/10 |
| Item count matches gold | 0/10 | 10/10 |
| Exact bundle match | 0/10 | 1/10 |
| Mean title recall | 0.0 | 0.10 |

On these 10 samples, the zero-shot base model produces generic prose titles, while the fine-tuned model follows the structured bundle schema and the catalog-title style.
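
The card does not define the metrics precisely. One plausible reading, offered here as an assumption, is that exact bundle match requires every (role, title) pair to agree in order, and title recall is the fraction of gold titles that appear among the predicted titles. The helpers and the mismatching prediction below are hypothetical.

```python
Bundle = list[tuple[str, str]]  # ordered (role, title) pairs


def _norm(title: str) -> str:
    """Normalize a title for comparison (assumed: case- and whitespace-insensitive)."""
    return " ".join(title.lower().split())


def exact_bundle_match(pred: Bundle, gold: Bundle) -> bool:
    """True if both bundles have the same roles and titles in the same order."""
    return [(r, _norm(t)) for r, t in pred] == [(r, _norm(t)) for r, t in gold]


def title_recall(pred: Bundle, gold: Bundle) -> float:
    """Fraction of gold titles that also appear among the predicted titles."""
    if not gold:
        return 0.0
    pred_titles = {_norm(t) for _, t in pred}
    hits = sum(1 for _, t in gold if _norm(t) in pred_titles)
    return hits / len(gold)


# Gold bundle taken from the example in this card.
gold = [
    ("bottom", "Levi's Women's 501 Original Shorts (Also Available in Plus)"),
    ("footwear", "Amazon Essentials Women's Belice Ballet Flat"),
]
# Hypothetical prediction with one made-up, non-matching title.
pred = [
    ("bottom", "Levi's Women's 501 Original Shorts (Also Available in Plus)"),
    ("footwear", "Hypothetical Brand Women's Ballet Flat"),
]

print(exact_bundle_match(pred, gold))  # False
print(title_recall(pred, gold))        # 0.5
print(exact_bundle_match(gold, gold))  # True
```

Averaging `title_recall` over the eval samples would give a mean title recall comparable in spirit to the table, though the reported figures may have been computed with a different matching rule.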

Example

Query: Casual summer outfit for women: denim shorts and ballet flats

Output (exact match on eval sample):

### Item 1: bottom
Levi's Women's 501 Original Shorts (Also Available in Plus)

### Item 2: footwear
Amazon Essentials Women's Belice Ballet Flat

Citation / lineage

Base model: Qwen/Qwen3-4B-Instruct-2507
Training data: flavianv/apparel23-qwen32b-kept-outfits-with-products
Internal report: docs/qwen4b_apparel23_bundle_sft_report.md in the RecoRL repo

License

This model inherits the license of its base model, Qwen/Qwen3-4B-Instruct-2507. See the base model's card for the terms.

Safetensors

Model size: 4B params
Tensor type: BF16
