woohello/olmo3-190m-zh-sft

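SFT (supervised fine-tuning) version: fine-tuned from woohello/olmo3-190m-zh-nano to learn instruction following, shifting the model from "continuing text" to "answering in the assistant role".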

Training configuration

Base model: woohello/olmo3-190m-zh-nano (26M, OLMo3 arch, SDPA)
Data: cmz1024/llm101-olmo3-zh-demo-data/sft/sft_t2t_mini.jsonl (conversation format)
LR: 5e-5 (10x lower than for pretraining)
Warmup: 5%
Assistant-only loss: True
Training: RTX 3090 (24GB), bf16, attn_implementation=sdpa
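
Two of the settings above, the 5% warmup and the assistant-only loss, can be illustrated with a minimal sketch. This is not the actual training script: the helper names `lr_at_step` and `assistant_only_labels`, the toy token ids, and the constant learning rate after warmup are all assumptions made for illustration (the card states only the peak LR and the warmup ratio). The one borrowed fact is that -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so positions labeled -100 contribute nothing to the loss.

```python
# Illustrative sketch only: helper names and the post-warmup schedule are
# assumptions, not taken from the actual training script.

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy skips by default


def lr_at_step(step, total_steps, peak_lr=5e-5, warmup_ratio=0.05):
    """Linear warmup to peak_lr over the first 5% of steps, then hold constant.

    Holding the LR constant after warmup is an assumption; the card gives
    only the peak LR and the warmup ratio.
    """
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr


def assistant_only_labels(token_ids, roles):
    """Copy token ids into labels, masking every non-assistant position."""
    return [
        tok if role == "assistant" else IGNORE_INDEX
        for tok, role in zip(token_ids, roles)
    ]


# Toy sequence: three user-turn tokens followed by two assistant-turn tokens.
token_ids = [11, 12, 13, 21, 22]
roles = ["user", "user", "user", "assistant", "assistant"]
labels = assistant_only_labels(token_ids, roles)
print(labels)  # [-100, -100, -100, 21, 22]

# Learning rate early in warmup, at the end of warmup, and mid-training.
print(lr_at_step(0, 1000), lr_at_step(49, 1000), lr_at_step(500, 1000))
```

With labels built this way, only the assistant's tokens are trained on, which matches the goal of teaching the model to answer rather than to reproduce the user's prompt.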

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the SFT checkpoint and its tokenizer (the tokenizer carries the chat template)
model = AutoModelForCausalLM.from_pretrained("woohello/olmo3-190m-zh-sft", attn_implementation="sdpa")
tok = AutoTokenizer.from_pretrained("woohello/olmo3-190m-zh-sft")

messages = [{"role": "user", "content": "你好"}]  # "Hello"
# return_dict=True returns input_ids together with the attention mask
inputs = tok.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Model size: 22M params (Safetensors)
Tensor type: BF16
